16:00:11 #startmeeting Infrastructure (2022-01-20)
16:00:11 Meeting started Thu Jan 20 16:00:11 2022 UTC.
16:00:11 This meeting is logged and archived in a public location.
16:00:11 The chair is mkonecny. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:00:11 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:00:11 The meeting name has been set to 'infrastructure_(2022-01-20)'
16:00:11 #meetingname infrastructure
16:00:11 The meeting name has been set to 'infrastructure'
16:00:11 #chair nirik siddharthvipul mobrien zlopez pingou bodanel dtometzki jnsamyak computerkid
16:00:11 Current chairs: bodanel computerkid dtometzki jnsamyak mkonecny mobrien nirik pingou siddharthvipul zlopez
16:00:15 #info Agenda is at: https://board.net/p/fedora-infra
16:00:15 #info About our team: https://docs.fedoraproject.org/en-US/cpe/
16:00:15 #topic greetings!
16:00:20 hello
16:00:26 .hello phsmoura
16:00:30 pmoura_: phsmoura 'Pedro Moura'
16:00:31 morning everyone. 🌞
16:00:32 .hello petebuffon
16:00:39 petebuffon: petebuffon 'Peter Buffon'
16:00:44 .hello dkirwan
16:00:44 .hi
16:00:44 Saffronique: dkirwan 'David Kirwan'
16:00:47 darknao: darknao 'Francois Andrieu'
16:00:54 Welcome everyone to the hottest news from Fedora Infrastructure
16:01:06 I will be your host for today
16:01:35 .hello zlopez
16:01:35 .hi
16:01:38 mkonecny: zlopez 'Michal Konecny'
16:01:41 mobrien: mobrien 'Mark O'Brien'
16:02:24 Let's see if there are any new members of our crew
16:02:25 #topic New folks introductions
16:02:25 #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
16:02:25 #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
16:02:36 Anybody new here?
16:02:42 Don't be shy :-)
16:03:12 .hi
16:03:15 austinpowered: austinpowered 'T.C. Williams'
16:03:40 .hi
16:03:41 lenkaseg: lenkaseg 'Lenka Segura'
16:04:25 It seems that we don't have anybody new here
16:04:39 Let's see who will be the host of the next show
16:04:48 #topic Next chair... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/7693d1e53e9656b4d5a9703d9a0d56e76f8dc28d)
16:05:54 Anybody want to host 2022-02-03?
16:06:00 fyi irc cut off the last message
16:06:45 I forgot that the bridge doesn't allow you to send long messages
16:06:48 #topic Next chair
16:06:48 #info magic eight ball says:
16:06:48 #info chair 2022-01-20 - mkonecny
16:07:00 #info chair 2022-01-27 - jrichardson
16:07:00 #info chair 2022-02-03 - ???
16:07:23 I can be the next next next chair :)
16:07:32 And sold!
16:07:40 hi together
16:07:54 #info chair 2022-02-03 - lenkaseg
16:08:02 i will do chair on 02-03
16:08:29 This is already sold, but you can do 2022-02-10
16:08:31 or dtometzki, then I can be after that.
16:08:37 yes
16:09:18 dtometzki: Just to make it clear, you want to take 2022-02-10?
16:09:27 yes please
16:09:45 #info chair 2022-02-10 - dtometzki
16:10:13 And next on your program are the announcements
16:10:15 #topic announcements and information
16:10:15 #info CPE Infra&Releng EU-hours team has a Monday through Thursday 30 minute meeting going through tickets at 10:30 Europe/Paris in #centos-meeting
16:10:28 #info CPE Infra&Releng NA-hours team has a Monday through Thursday 30 minute meeting going through tickets at 18:00 UTC in #fedora-meeting-3
16:10:28 #info If your team wants support from the Fedora Program Management Team, file an issue: https://pagure.io/fedora-pgm/pgm_team/issues?template=support_request
16:10:32 #info matrix sig is forming, see https://discussion.fedoraproject.org/t/bi-weekly-meeting-for-matrix-sig-or-matrix-team/35947/3 if interested
16:11:31 Anything else?
16:12:28 oh, one thing
16:12:34 #info mass rebuild is underway
16:12:58 here is the state of the rebuild: https://kojipkgs.fedoraproject.org/mass-rebuild/f36-need-rebuild.html
16:15:13 Ok, let's continue
16:15:26 Next on the list is our friend on the phone
16:15:32 #topic Oncall
16:15:32 #info https://fedoraproject.org/wiki/Infrastructure/Oncall
16:15:32 #info https://docs.fedoraproject.org/en-US/cpe/day_to_day_fedora/
16:15:40 #info darknao on call from 2022-01-13 to 2022-01-20
16:15:40 #info petebuffon on call from 2022-01-20 to 2022-01-27
16:15:40 #info ??? on call from 2022-01-27 to 2022-02-03
16:15:59 I can take the next slot...
16:16:22 It's yours
16:16:45 #info nirik on call from 2022-01-27 to 2022-02-03
16:17:11 #info Summary of last week: (from current oncall)
16:17:26 nothing to report
16:17:29 darknao: Floor is yours
16:17:54 no ping this last week
16:18:17 Thanks for the report :-)
16:18:37 Let's look at the situation in Fedora Infra
16:18:38 #topic Monitoring discussion [nirik]
16:18:38 #info https://nagios.fedoraproject.org/nagios
16:18:38 #info Go over existing out items and fix
16:18:53 ok, let's see
16:19:23 a few arm builders down...
16:19:39 a bunch of alerts due to a cert expiring. I'm working with digicert to try and renew it.
16:20:09 some copr alerts, but I think that's being worked on
16:20:20 and normal misc stuff...
16:20:39 so, we can move on unless folks have questions...
16:22:01 Thanks nirik
16:22:31 Today we have a special guest who will be talking about the new OpenShift cluster
16:22:44 #topic Learning topic
16:22:45 #info 2022-01-20 - new OpenShift Cluster [dkirwan]
16:22:52 hah yes special!
16:23:00 Please welcome Saffronique
16:23:17 welcome
16:23:20 The floor is yours
16:23:21 Hello everyone, my name is David Kirwan. I'm a member of the CPE team; I generally prefer to hide in the background! But I got roped into giving a talk today! ;)
16:23:33 hi!
16:23:44 welcome Saffronique o/ :)
16:23:46 https://gist.github.com/davidkirwan/38a195d733867fc707a08ed0ec73ee5a I put a few slides together
16:23:46 * nirik waves to Saffronique
16:24:06 You can read along as I'm typing here; it will generally be the same information.
16:24:25 Before I start: I'm not an expert by any means, so I may not be able to answer questions if they are very technical, but I'll do my best.
16:24:37 Ok. We recently deployed a new OpenShift 4 cluster on a mix of VMs (control plane) and baremetal (workers).
16:24:55 Most are aware already I'm sure, but just in case you are not: we currently run OpenShift 3 in production.
16:25:07 Our intention is to migrate all our applications over to this new cluster in the coming months.
16:25:22 You can log in to the clusters with your Noggin/IPA usernames, but by default you won't have the ability to run any workloads.
16:25:37 We are working behind the scenes to get access to an OpenShift cluster on which the community may run containers etc, but no estimate as to when this will be available.
16:25:50 I hope this will be a replacement for the Communishift we had previously.
16:26:17 do you have a rough estimate on how many vms/baremetal machines?
16:26:26 So at the moment we have 3 control plane VMs
16:26:39 and 3 baremetal workers in production, and the same in staging
16:26:52 ok thanks
16:26:56 I can get the full specs for you later, I don't have them to hand currently
16:27:06 We are in the process of adding 3 more to prod, and I believe 2 to staging?
16:27:15 Just waiting to get the hardware on the right vlan.
16:27:41 Ok. Much is the same, but there are some interesting differences.
16:27:57 OpenShift 4 includes many new technologies and features; one example is the support for Operators.
16:28:20 Another big change is that RHEL CoreOS is the operating system which OCP4 now runs on top of.
16:28:33 OCP4 self-manages the RHCOS machines it runs on top of.
16:28:48 So no more SSH'ing in to fix things manually!
16:29:12 If we need customisations we should instead use MachineConfig to have OpenShift make the changes on our behalf.
16:29:39 Workloads can expose monitoring metrics and even set alerts based on these via the User Workload Monitoring stack
16:30:07 I gave a talk recently internally at the Fedora Release Party: https://github.com/davidkirwan/asset_monitoring/blob/master/openshift-monitoring-stack/talk.md
16:30:38 Ok, a quick overview of the next big thing available in OCP4: Operators
16:30:47 If you are familiar with Kubernetes, without realising it you will already know what a Resource Controller is.
16:31:13 Resource Controllers are the system logic which manages Kubernetes API object types
16:31:20 eg: (Pod, Deployment, PersistentVolume, PersistentVolumeClaim etc).
16:31:34 When you create a Deployment..
16:31:48 Behind the scenes there is a resource controller which manages all Deployments and goes to work
16:31:56 Well, an Operator.. is a custom resource controller.
16:32:12 When we install an operator, we extend the Kubernetes API, which allows us to add new features.
16:32:27 We have a framework/SDK which we can use to build and develop these operators
16:32:36 OCP4 has a catalog from which Operators may be downloaded and installed.
16:32:58 So from a user's point of view, if you have the correct permissions you can just pick and choose to install an Operator.. which will manage some other software you want.
16:33:03 eg if you want a Postgres db..
16:33:16 you can create a Postgres object (hypothetical example) in your namespace
16:33:28 the Postgres operator will figure out what's needed and make it happen.
16:33:47 We've got a couple of operators already installed.
16:33:56 Here are the ones we've earmarked for use.
16:34:03 OpenShift Virtualisation (KubeVirt, kvm)
16:34:24 This allows us to run VMs within OpenShift
16:35:07 The cluster is x86 only currently, but this is already going to be used by some of our first tenants in Fedora CoreOS.
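[Editor's note] The resource-controller idea described above (watch the desired state, compare it with what is actually running, act to converge) can be sketched as a toy reconcile loop. This is a minimal illustration in plain Python, not the real Kubernetes controller machinery or operator SDK; the `reconcile` function and the Postgres-style object are hypothetical:

```python
# Toy sketch of the reconcile pattern behind resource controllers and
# operators: diff desired state against observed state, and plan the
# actions that make the cluster converge. Purely illustrative.

def reconcile(desired: dict, actual: dict) -> list:
    """Return the (verb, name, spec) actions needed so actual matches desired."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))   # declared but missing
        elif actual[name] != spec:
            actions.append(("update", name, spec))   # present but drifted
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, None))   # no longer declared
    return actions

# Like the hypothetical Postgres example above: the user declares a
# Postgres object, the operator notices it does not exist and creates it.
desired = {"my-db": {"kind": "Postgres", "replicas": 1}}
print(reconcile(desired, actual={}))
# → [('create', 'my-db', {'kind': 'Postgres', 'replicas': 1})]
```

A real operator runs this loop continuously against the Kubernetes API, which is why installing one effectively extends the API with new object types it knows how to manage.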
16:35:19 Local Storage Operator (exposes the local disk storage on nodes as PersistentVolumes)
16:36:00 On each of the worker nodes we have 8 SSD disks; the first is the root volume for RHCOS, and the remaining 7 will be formatted and exposed as PersistentVolumes via this Local Storage Operator.
16:36:24 (White lie: on 2 of the 3 worker nodes we have 8 disks.)
16:36:44 The 3 new nodes being added to prod are storage heavy, and will allow us to balance this out better.
16:36:52 OpenShift Data Foundation (renamed in OCP 4.9 from OpenShift Container Storage; Ceph)
16:37:26 This operator takes all of the PVs made available by the Local Storage Operator, and then basically installs Ceph.
16:37:39 So we have distributed storage built using all the spare disks on the worker nodes. Pretty neat
16:37:53 We will be able to provision storage on demand, without infra tickets etc..
16:38:00 And eventually controlled via Quotas.
16:38:16 So umm that's all I got really, does anyone have any questions?
16:38:59 Do we know the quotas?
16:39:25 Not yet, but.. I hope to use some of the initial ideas that we put together for the CentOS CI OCP4 cluster.
16:39:43 What account do you need to login?
16:39:45 https://pagure.io/centos-infra/issue/8#
16:39:58 Your Noggin/IPA account should work austinpowered
16:40:09 FAS?
16:40:42 Technically not FAS, but yes. The same user/pass you use to authenticate elsewhere in Fedora infra
16:40:53 accounts.fedoraproject.org
16:41:28 Just to be clear, while you can login, you won't be able to do anything.
16:41:33 No quotas are in place by default.
16:42:02 FAS is the name of our old system but the name can be used interchangeably with Noggin/IPA
16:42:02 Understood. It would be helpful to see how things are done.
16:42:13 But if you are owners of apps running on the current production 3.11 cluster, now is a good time to think about migrating over to this new system!
16:43:14 Speaking of that, do we have a schedule on when the migration will start?
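[Editor's note] To make the storage layout above concrete, here is a rough capacity sketch. The node and disk counts (3 workers, 8 SSDs each, one reserved for the RHCOS root) come from the talk; the 1 TiB disk size and the 3-way Ceph replication factor are assumptions for illustration only, not confirmed cluster specs (and per the "white lie" above, one node actually has fewer disks):

```python
# Back-of-the-envelope usable capacity for the Ceph pool that OpenShift
# Data Foundation builds out of the Local Storage Operator's PVs.
# Disk size and replication factor are assumptions, not actual specs.

NODES = 3
DISKS_PER_NODE = 8 - 1   # first SSD on each worker is the RHCOS root volume
DISK_SIZE_TIB = 1.0      # hypothetical size, for illustration only
REPLICATION = 3          # assumed 3-way replication, a common Ceph default

osds = NODES * DISKS_PER_NODE        # spare disks that become Ceph OSDs
raw_tib = osds * DISK_SIZE_TIB
usable_tib = raw_tib / REPLICATION   # each byte is stored three times

print(f"OSDs: {osds}, raw: {raw_tib} TiB, usable: ~{usable_tib:.1f} TiB")
# → OSDs: 21, raw: 21.0 TiB, usable: ~7.0 TiB
```

With larger disks the numbers scale linearly; the point is simply that replication divides the raw pool by roughly three before quotas are carved out of it.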
16:43:44 Saffronique: We are thinking about it, and starting next week some people in Fedora Infra will start looking at it
16:43:45 Not yet, I was actually talking with some others earlier that I'd like to write a proposal, to earmark some time so I can work on this within CPE
16:43:54 oh nice mkonecny
16:44:11 We already have a few apps for which we have PRs
16:44:27 https://pagure.io/fedora-infra/ansible/pull-request/844
16:44:33 https://pagure.io/fedora-infra/ansible/pull-request/843
16:44:38 https://pagure.io/fedora-infra/ansible/pull-request/842
16:44:43 https://pagure.io/fedora-infra/ansible/pull-request/841
16:44:51 We will start with those
16:44:53 well, I'd like to get all the hardware in... but I guess we don't have to wait for that
16:45:23 The decision is on you nirik :-)
16:46:08 well, let's see... I'm going to work with networking later this morning to get all those new boxes on the right vlans.
16:46:36 so, should know more after that. next week might work
16:46:57 As planned :-)
16:47:18 Zlopez: I think we have crossed wires; the plan for next week was for me and nirik to get all the "new" hardware added to the ocp nodes rather than migrate the apps
16:47:42 yeah, we still need to add em...
16:48:03 Oh, I thought that the migration was part of it
16:48:21 Mistake on my side then
16:48:58 Communication error from both of us I think 🙂
16:49:32 If you need to resolve the hardware first, it makes sense
16:50:00 Let's go to the next topic for this meeting
16:50:01 #topic Upcoming learning topics
16:50:01 #info ?? - Fedora infra server monitoring [nirik]
16:50:05 We have one upcoming learning topic
16:50:19 nirik when do you want to do it?
16:50:44 I could do it feb 3rd the same time I run the meeting?
16:50:55 or could wait and do it next after that
16:51:10 feb 17th? but that's a while out
16:51:24 We can do it the 3rd and look at the backlog next week
16:52:57 Do we have any idea for other learning topics?
16:53:35 I wonder if we couldn't go over the current docs pipeline and then perhaps discuss ways to streamline it.
16:53:43 darknao and I talked about this a while back...
16:54:42 Good topic
16:55:14 #info ??? - Docs pipeline [???]
16:55:35 Who should talk about it?
16:55:40 might be we could leverage s2i and use pods to distribute... not sure, might need prototyping.
16:56:02 if darknao could that would be great... but if not I could try
16:56:56 darknao: Do you want to take this one?
16:57:05 yeah I can try I guess
16:57:30 Any preferred date?
16:57:57 not yet ^^
16:58:19 So let's keep it like this
16:58:20 #info ??? - Docs pipeline [darknao]
16:58:28 You can add a date when you want
16:58:29 I think feb 17th is the next non-backlog one
16:59:00 Should be
16:59:05 alright, 17th of feb seems doable
16:59:50 #info 2022-02-17 - Docs pipeline [darknao]
17:00:00 Thanks
17:00:06 We are at the end of our time
17:00:15 Thanks everyone for attending
17:00:22 I hope you liked this show :-)
17:00:26 many thanks mkonecny
17:00:28 #endmeeting