16:00:11 <mkonecny> #startmeeting Infrastructure (2022-01-20)
16:00:11 <zodbot> Meeting started Thu Jan 20 16:00:11 2022 UTC.
16:00:11 <zodbot> This meeting is logged and archived in a public location.
16:00:11 <zodbot> The chair is mkonecny. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:00:11 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:00:11 <zodbot> The meeting name has been set to 'infrastructure_(2022-01-20)'
16:00:11 <mkonecny> #meetingname infrastructure
16:00:11 <zodbot> The meeting name has been set to 'infrastructure'
16:00:11 <mkonecny> #chair nirik siddharthvipul mobrien zlopez pingou bodanel dtometzki jnsamyak computerkid
16:00:11 <zodbot> Current chairs: bodanel computerkid dtometzki jnsamyak mkonecny mobrien nirik pingou siddharthvipul zlopez
16:00:15 <mkonecny> #info Agenda is at: https://board.net/p/fedora-infra
16:00:15 <mkonecny> #info About our team: https://docs.fedoraproject.org/en-US/cpe/
16:00:15 <mkonecny> #topic greetings!
16:00:20 <pmoura_> hello
16:00:26 <pmoura_> .hello phsmoura
16:00:30 <zodbot> pmoura_: phsmoura 'Pedro Moura' <moura.pedro123@gmail.com>
16:00:31 <nirik> morning everyone. 🌞
16:00:32 <petebuffon> .hello petebuffon
16:00:39 <zodbot> petebuffon: petebuffon 'Peter Buffon' <pabuffon@gmail.com>
16:00:44 <Saffronique> .hello dkirwan
16:00:44 <darknao> .hi
16:00:44 <zodbot> Saffronique: dkirwan 'David Kirwan' <davidkirwanirl@gmail.com>
16:00:47 <zodbot> darknao: darknao 'Francois Andrieu' <darknao@drkn.ninja>
16:00:54 <mkonecny> Welcome everyone to hottest news from Fedora Infrastructure
16:01:06 <mkonecny> I will be your host for today
16:01:35 <mkonecny> .hello zlopez
16:01:35 <mobrien> .hi
16:01:38 <zodbot> mkonecny: zlopez 'Michal Konecny' <michal.konecny@psmail.xyz>
16:01:41 <zodbot> mobrien: mobrien 'Mark O'Brien' <markobri@redhat.com>
16:02:24 <mkonecny> Let's look if there are any new members of our crew
16:02:25 <mkonecny> #topic New folks introductions
16:02:25 <mkonecny> #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
16:02:25 <mkonecny> #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
16:02:36 <mkonecny> Anybody new here?
16:02:42 <mkonecny> Don't be shy :-)
16:03:12 <austinpowered> .hi
16:03:15 <zodbot> austinpowered: austinpowered 'T.C. Williams' <fedoraproject@wootenwilliams.com>
16:03:40 <lenkaseg> .hi
16:03:41 <zodbot> lenkaseg: lenkaseg 'Lenka Segura' <lenka@sepu.cz>
16:04:25 <mkonecny> It seems that we don't have anybody new here
16:04:39 <mkonecny> Let's look who will be the host of the next show
16:04:48 <mkonecny> #topic Next chair... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/7693d1e53e9656b4d5a9703d9a0d56e76f8dc28d)
16:05:54 <mkonecny> Anybody wants to host 2022-02-03?
16:06:00 <petebuffon> fyi irc cut off the last message
16:06:45 <mkonecny> I forgot, that the bridge doesn't allow you to send long messages
16:06:48 <mkonecny> #topic Next chair
16:06:48 <mkonecny> #info magic eight ball says:
16:06:48 <mkonecny> #info chair 2022-01-20 - mkonecny
16:07:00 <mkonecny> #info chair 2022-01-27 - jrichardson
16:07:00 <mkonecny> #info chair 2022-02-03 - ???
16:07:23 <lenkaseg> I can be the next next next chair :)
16:07:32 <mkonecny> And sold!
16:07:40 <dtometzki> hi together
16:07:54 <mkonecny> #info chair 2022-02-03 - lenkaseg
16:08:02 <dtometzki> i will do cair on 02-03
16:08:07 <dtometzki> chair
16:08:29 <mkonecny> This is already sold, but you can do 2022-02-10
16:08:31 <lenkaseg> or dtometzki then, I can be after that.
16:08:37 <dtometzki> yes
16:09:18 <mkonecny> dtometzki:  Just to make it clear, you want to take 2022-02-10?
16:09:27 <dtometzki> yes please
16:09:45 <mkonecny> #info chair 2022-02-10 - dtometzki
16:10:13 <mkonecny> And next on your program are the announcements
16:10:15 <mkonecny> #topic announcements and information
16:10:15 <mkonecny> #info CPE Infra&Releng EU-hours team has a Monday through Thursday 30 minute meeting going through tickets at 1030 Europe/Paris in #centos-meeting
16:10:28 <mkonecny> #info CPE Infra&Releng NA-hours team has a Monday through Thursday 30 minute meeting going through tickets at 1800 UTC in #fedora-meeting-3
16:10:28 <mkonecny> #info If your team wants support from the Fedora Program Management Team, file an issue: https://pagure.io/fedora-pgm/pgm_team/issues?template=support_request
16:10:32 <mkonecny> #info matrix sig is forming, see https://discussion.fedoraproject.org/t/bi-weekly-meeting-for-matrix-sig-or-matrix-team/35947/3 if interested
16:11:31 <mkonecny> Anything else?
16:12:28 <nirik> oh, one thing
16:12:34 <nirik> #info mass rebuild is underway
16:12:58 <jednorozec> here is the state of rebuild https://kojipkgs.fedoraproject.org/mass-rebuild/f36-need-rebuild.html
16:15:13 <mkonecny> Ok, let's continue
16:15:26 <mkonecny> Next on your list is friend on phone
16:15:32 <mkonecny> #topic Oncall
16:15:32 <mkonecny> #info https://fedoraproject.org/wiki/Infrastructure/Oncall
16:15:32 <mkonecny> #info https://docs.fedoraproject.org/en-US/cpe/day_to_day_fedora/
16:15:40 <mkonecny> #info darknao on call from 2022-01-13 to 2022-01-20
16:15:40 <mkonecny> #info petebuffon on call from 2022-01-20 to 2022-01-27
16:15:40 <mkonecny> #info ??? on call from 2022-01-27 to 2022-02-03
16:15:59 <nirik> I can take the next slot...
16:16:22 <mkonecny> It's yours
16:16:45 <mkonecny> #info nirik on call from 2022-01-27 to 2022-02-03
16:17:11 <mkonecny> #info Summary of last week: (from current oncall )
16:17:26 <darknao> nothing to report
16:17:29 <mkonecny> darknao:  Floor is yours
16:17:54 <darknao> no ping this last week
16:18:17 <mkonecny> Thanks for the report :-)
16:18:37 <mkonecny> Let's look at the situation in Fedora Infra
16:18:38 <mkonecny> #topic Monitoring discussion [nirik]
16:18:38 <mkonecny> #info https://nagios.fedoraproject.org/nagios
16:18:38 <mkonecny> #info Go over existing out items and fix
16:18:53 <nirik> ok, lets see
16:19:23 <nirik> a few arm builders down...
16:19:39 <nirik> a bunch of alerts due to a cert expiring. I'm working with digicert to try and renew it.
16:20:09 <nirik> some copr alerts, but I think that's being worked on
16:20:20 <nirik> and normal misc stuff...
16:20:39 <nirik> so, we can move on unless folks have questions...
16:22:01 <mkonecny> Thanks nirik
16:22:31 <mkonecny> Today we have a special guest who will be talking about New OpenShift Cluster
16:22:44 <mkonecny> #topic Learning topic
16:22:45 <mkonecny> #info 2022-01-20 - new OpenShift Cluster  [dkirwan]
16:22:52 <Saffronique> hah yes special!
16:23:00 <mkonecny> Please welcome Saffronique
16:23:17 <dtometzki> welcome
16:23:20 <mkonecny> The floor is yours
16:23:21 <Saffronique> Hello everyone, my name is David Kirwan, I'm a member of the CPE team. I generally prefer to hide in the background! But I got roped into giving a talk today! ;)
16:23:33 <petebuffon> hi!
16:23:44 <darknao> welcome Saffronique o/ :)
16:23:46 <Saffronique> https://gist.github.com/davidkirwan/38a195d733867fc707a08ed0ec73ee5a I put a few slides together
16:23:46 * nirik waves to Saffronique
16:24:06 <Saffronique> Can read along as I'm typing here, it will generally be the same information.
16:24:25 <Saffronique> Before I start I'm not an expert by any means, so I may not be able to answer questions if they are very technical, but I'll do my best.
16:24:37 <Saffronique> Ok, We recently deployed a new Openshift 4 cluster on a mix of VMs (Control Plane) and Baremetal (Workers).
16:24:55 <Saffronique> Most are aware already I'm sure, but just in case you are not, we currently run Openshift 3 in production.
16:25:07 <Saffronique> Our intention is to migrate all our applications over to this new cluster in the coming months.
16:25:22 <Saffronique> You can login to the clusters with your Noggin/IPA usernames, but by default you won't have the ability to run any workloads.
16:25:37 <Saffronique> We are working behind the scenes to get access to an Openshift cluster on which the community may run containers etc, but no estimate as to when this will be available.
16:25:50 <Saffronique> I hope this will be a replacement for the Communishift we had previously.
16:26:17 <petebuffon> do you have a rough estimate on how many vms/baremetal machines?
16:26:26 <Saffronique> So at the moment we have 3 control plane VMs
16:26:39 <Saffronique> and 3 worker baremetal in production, and in staging
16:26:52 <petebuffon> ok thanks
16:26:56 <Saffronique> I can get the full specs for you later, I don't have them to hand currently
16:27:06 <Saffronique> We are in the process of adding 3 more to prod, and I believe 2 to staging?
16:27:15 <Saffronique> Just waiting to get the hardware on the right vlan.
16:27:41 <Saffronique> Ok, Much is the same but there are some interesting differences.
16:27:57 <Saffronique> Openshift 4 includes many new technologies and features, one example is the support for Operators.
16:28:20 <Saffronique> Another big change is that RHEL CoreOS is the operating system which OCP4 now runs on top of.
16:28:33 <Saffronique> OCP4 self manages the RHCOS machines it runs on top of.
16:28:48 <Saffronique> So no more SSH'ing in to fix things manually!
16:29:12 <Saffronique> If we need customisations we should instead use MachineConfig to have Openshift make the changes on our behalf.
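A minimal sketch of what such a MachineConfig could look like, built as a Python dict and printed as JSON. The object name `99-worker-example-file`, the file path `/etc/example.conf`, and the file contents are hypothetical and were not mentioned in the talk; the Ignition version shown is an assumption and depends on the cluster release.

```python
import base64
import json


def build_machine_config(name: str, role: str, path: str, contents: str) -> dict:
    """Build a MachineConfig manifest that writes one file onto every node in a pool."""
    # Ignition expects file contents as a data URL; base64 keeps arbitrary text safe.
    encoded = base64.b64encode(contents.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            "name": name,
            # This label selects the machine pool (e.g. worker) the config applies to.
            "labels": {"machineconfiguration.openshift.io/role": role},
        },
        "spec": {
            "config": {
                "ignition": {"version": "3.2.0"},  # assumed; match your cluster release
                "storage": {
                    "files": [
                        {
                            "path": path,
                            "mode": 0o644,  # serialised as decimal 420 in JSON
                            "overwrite": True,
                            "contents": {
                                "source": "data:text/plain;charset=utf-8;base64," + encoded
                            },
                        }
                    ]
                },
            }
        },
    }


# Hypothetical example values, for illustration only.
manifest = build_machine_config(
    name="99-worker-example-file",
    role="worker",
    path="/etc/example.conf",
    contents="example = true\n",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could then be applied with `oc apply -f`, after which the cluster rolls the change out to the nodes itself instead of an admin editing files over SSH.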
16:29:39 <Saffronique> Workloads can expose monitoring metrics and even set alerts based on these via User Workload Monitoring Stack
16:30:07 <Saffronique> I gave a talk recently internally at the Fedora Release Party: https://github.com/davidkirwan/asset_monitoring/blob/master/openshift-monitoring-stack/talk.md
16:30:38 <Saffronique> Ok, a quick overview of the next big thing available in OCP4: Operators
16:30:47 <Saffronique> If you are familiar with Kubernetes, without realising it you will already know what a Resource Controller is.
16:31:13 <Saffronique> Resource Controllers are the system logic which manages Kubernetes API object types
16:31:20 <Saffronique> eg: (Pod, Deployment, PersistentVolume, PersistentVolumeClaim etc).
16:31:34 <Saffronique> When you create a Deployment..
16:31:48 <Saffronique> Behind the scenes there is a resource controller which manages all Deployments and goes to work
16:31:56 <Saffronique> Well an Operator.. is a custom resource controller.
16:32:12 <Saffronique> When we install an operator, we extend the Kubernetes API, which allows us to add new features.
16:32:27 <Saffronique> We have a framework/sdk which we can use to build and develop these operators
16:32:36 <Saffronique> OCP4 has a catalog from which Operators may be downloaded and installed.
16:32:58 <Saffronique> So from a user's point of view, if you have the correct permissions you can just pick and choose to install an Operator.. which will manage some other software you want.
16:33:03 <Saffronique> eg if you want a Postgres db..
16:33:16 <Saffronique> you can create a Postgres object (hypothetical example) in your namespace
16:33:28 <Saffronique> the Postgres operator will figure out what's needed and make it happen.
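The "figure out what's needed and make it happen" step is a reconcile loop: compare the declared state with the actual state and act on the difference. A toy sketch in plain Python, with no Kubernetes client involved; `PostgresSpec`, `Cluster`, and `reconcile` are hypothetical names for illustration, not the API of any real Postgres operator.

```python
from dataclasses import dataclass, field


@dataclass
class PostgresSpec:
    """Desired state, as a user would declare it in a (hypothetical) Postgres object."""
    name: str
    replicas: int


@dataclass
class Cluster:
    """Stand-in for actual cluster state: database name -> running instance count."""
    running: dict = field(default_factory=dict)


def reconcile(spec: PostgresSpec, cluster: Cluster) -> list:
    """Compare desired and actual state and return the actions taken to converge them."""
    actions = []
    current = cluster.running.get(spec.name, 0)
    # Too few instances: start more until the desired count is reached.
    while current < spec.replicas:
        current += 1
        actions.append(f"start {spec.name}-{current}")
    # Too many instances: stop the highest-numbered ones first.
    while current > spec.replicas:
        actions.append(f"stop {spec.name}-{current}")
        current -= 1
    cluster.running[spec.name] = current
    return actions


cluster = Cluster()
print(reconcile(PostgresSpec("appdb", 2), cluster))  # first pass starts instances
print(reconcile(PostgresSpec("appdb", 2), cluster))  # second pass finds nothing to do
```

A real operator runs this kind of comparison every time the object or the cluster changes, which is why the user only has to declare what they want.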
16:33:47 <Saffronique> We've got a couple of operators already installed.
16:33:56 <Saffronique> Here are the ones we've earmarked for use.
16:34:03 <Saffronique> Openshift Virtualisation (KubeVirt, kvm)
16:34:24 <Saffronique> This allows us to run VMs within Openshift
16:35:07 <Saffronique> The cluster is x86 only currently, but this is already going to be used by some of our first tenants in Fedora CoreOS.
16:35:19 <Saffronique> Local Storage Operator (Exposes the local disk storage on nodes as PersistentVolumes)
16:36:00 <Saffronique> On each of the worker nodes we have 8 SSD disks, the first is the root volume for RHCOS, and the remaining 7 will be formatted and exposed as PersistentVolumes via this Local Storage Operator.
16:36:24 <Saffronique> (White lie: on 2 of the 3 worker nodes we have 8 disks.)
16:36:44 <Saffronique> The 3 new nodes being added to prod are storage heavy, and will allow us to balance out this better.
16:36:52 <Saffronique> Openshift Data Foundation (Renamed in OCP4.9 from Openshift Container Storage, Ceph)
16:37:26 <Saffronique> This operator, takes all of the PVs made available by the local storage operator, and then installs Ceph basically.
16:37:39 <Saffronique> So we have a distributed storage built using all the spare disks on the worker nodes. Pretty neat
16:37:53 <Saffronique> We will be able to provision storage on demand, without infra tickets etc..
16:38:00 <Saffronique> And eventually controlled via Quotas.
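Provisioning storage on demand usually means a tenant creates a PersistentVolumeClaim and the storage layer satisfies it. A minimal sketch that builds such a claim as a Python dict; the claim name, namespace, size, and the storage class name `example-ceph-rbd` are all hypothetical, since the talk did not name the storage classes on this cluster.

```python
import json


def build_pvc(name: str, namespace: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest requesting a volume of the given size."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            # Hypothetical class name; the real one depends on the cluster setup.
            "storageClassName": storage_class,
        },
    }


claim = build_pvc("example-data", "example-app", "5Gi", "example-ceph-rbd")
print(json.dumps(claim, indent=2))
```

Once quotas are in place, a claim like this would count against the requesting namespace's storage allowance.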
16:38:16 <Saffronique> So umm that's all I've got really, does anyone have any questions?
16:38:59 <mkonecny> Do we know the quotas?
16:39:25 <Saffronique> Not yet, but.. I hope to use some of the initial ideas that we put together for the CentOS CI OCP4 cluster.
16:39:43 <austinpowered> What account do you need to login?
16:39:45 <Saffronique> https://pagure.io/centos-infra/issue/8#
16:39:58 <Saffronique> Your Noggin/IPA account should work austinpowered
16:40:09 <austinpowered> FAS?
16:40:42 <Saffronique> Technically not FAS, but yes. The same user/pass you use to authenticate elsewhere in Fedora infra
16:40:53 <Saffronique> accounts.fedoraproject.org
16:41:28 <Saffronique> Just to be clear, while you can login, you won't be able to do anything.
16:41:33 <Saffronique> No quotas are in place by default.
16:42:02 <mobrien> FAS is the name of our old system but the name can be used interchangeably with noggin/ipa
16:42:02 <austinpowered> Understood. It would be helpful to see how things are done.
16:42:13 <Saffronique> But if you are owners of apps running on the current production 3.11 cluster, now is a good time to think about migrating over to this new system!
16:43:14 <darknao> Speaking of that, do we have a schedule on when the migration will start ?
16:43:44 <mkonecny> Saffronique: We are thinking about it and starting next week, some people in Fedora Infra will start looking at it
16:43:45 <Saffronique> Not yet, I was actually talking with some others earlier that I'd like to write a proposal, to earmark some time so I can work on this within CPE
16:43:54 <Saffronique> oh nice mkonecny
16:44:11 <mkonecny> We already have a few apps for which we have PRs
16:44:27 <mkonecny> https://pagure.io/fedora-infra/ansible/pull-request/844
16:44:33 <mkonecny> https://pagure.io/fedora-infra/ansible/pull-request/843
16:44:38 <mkonecny> https://pagure.io/fedora-infra/ansible/pull-request/842
16:44:43 <mkonecny> https://pagure.io/fedora-infra/ansible/pull-request/841
16:44:51 <mkonecny> We will start with those
16:44:53 <nirik> well, I'd like to get all the hardware in... but I guess we don't have to wait for that
16:45:23 <mkonecny> The decision is on you nirik :-)
16:46:08 <nirik> well, lets see... I'm going to work with networking later this morning to get all those new boxes on the right vlans.
16:46:36 <nirik> so, should know more after that. next week might work
16:46:57 <mkonecny> As planned :-)
16:47:18 <mobrien> Zlopez: I think we have crossed wires, the plan for next week was for me and nirik to get all the "new" hardware added to the ocp nodes rather than migrate the apps
16:47:42 <nirik> yeah, we still need to add em...
16:48:03 <mkonecny> Oh, I thought that the migration was part of it
16:48:21 <mkonecny> Mistake on my side then
16:48:58 <mobrien> Communication error from both of us I think 🙂
16:49:32 <mkonecny> If you need to resolve the hardware first, it makes sense
16:50:00 <mkonecny> Let's go to next topic for this meeting
16:50:01 <mkonecny> #topic Upcoming learning topics
16:50:01 <mkonecny> #info ?? - Fedora infra server monitoring [nirik]
16:50:05 <mkonecny> We have one upcoming learning topic
16:50:19 <mkonecny> nirik when do you want to do it?
16:50:44 <nirik> I could do it feb 3rd the same time I run the meeting?
16:50:55 <nirik> or could wait and do it next after that
16:51:10 <nirik> feb 17th? but thats a while out
16:51:24 <mkonecny> We can do it the 3rd and look at the backlog next week
16:52:57 <mkonecny> Do we have any idea for other learning topics?
16:53:35 <nirik> I wonder if we couldn't go over the current docs pipeline and then perhaps discuss ways to streamline it.
16:53:43 <nirik> darknao and I talked about this a while back...
16:54:42 <mkonecny> Good topic
16:55:14 <mkonecny> #info ??? - Docs pipeline [???]
16:55:35 <mkonecny> Who should talk about it?
16:55:40 <nirik> might be we could leverage s2i and use pods to distribute... not sure, might need prototyping.
16:56:02 <nirik> if darknao could that would be great... but if not I could try
16:56:56 <mkonecny> darknao:  Do you want to take this one?
16:57:05 <darknao> yeah I can try I guess
16:57:30 <mkonecny> Any preferred date?
16:57:57 <darknao> not yet ^^
16:58:19 <mkonecny> So let's keep it like this
16:58:20 <mkonecny> #info ??? - Docs pipeline [darknao]
16:58:28 <mkonecny> You can add date when you want
16:58:29 <nirik> I think feb 17th is the next non backlog one
16:59:00 <mkonecny> Should be
16:59:05 <darknao> alright, 17th of feb seems doable
16:59:50 <mkonecny> #info 2022-02-17 - Docs pipeline [darknao]
17:00:00 <mkonecny> Thanks
17:00:06 <mkonecny> We are at the end of your time
17:00:15 <mkonecny> Thanks everyone for attending
17:00:22 <mkonecny> I hope you liked this show :-)
17:00:26 <dtometzki> many thanks mkonecny
17:00:28 <mkonecny> #endmeeting