16:00:22 #startmeeting Infrastructure (2023-09-21)
16:00:22 Meeting started Thu Sep 28 16:00:22 2023 UTC.
16:00:22 This meeting is logged and archived in a public location.
16:00:22 The chair is pcreech. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:00:22 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:00:22 The meeting name has been set to 'infrastructure_(2023-09-21)'
16:00:22 #meetingname infrastructure
16:00:22 The meeting name has been set to 'infrastructure'
16:00:22 #chair nirik zlopez nb bodanel dtometzki jnsamyak lenkaseg patrikp
16:00:22 Current chairs: bodanel dtometzki jnsamyak lenkaseg nb nirik patrikp pcreech zlopez
16:00:22 #info Agenda is at: https://board.net/p/fedora-infra
16:00:22 #info About our team: https://docs.fedoraproject.org/en-US/cpe/
16:00:40 morning all
16:00:45 Mornin!
16:00:55 hello together
16:00:58 .hi
16:01:00 dtometzki: dtometzki 'Damian Tometzki'
16:01:02 morning o/
16:01:04 .hello pcreech17
16:01:07 pcreech: pcreech17 'Patrick Creech'
16:01:09 .hi
16:01:11 darknao: darknao 'Francois Andrieu'
16:02:20 what is the status with the matrix bridge ?
16:02:53 still down
16:03:24 last update I know of is from https://matrix.org/blog/2023/09/01/this-week-in-matrix-2023-09-01/
16:04:39 On the plus side we are close to a matrix meeting bot. once thats ready we can move meetings over to matrix too.
16:04:48 .hi
16:04:49 phsmoura: phsmoura 'Pedro Moura'
16:04:57 nice!
16:05:15 #topic New folks introductions
16:05:15 #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
16:05:15 #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
16:05:29 I haven't had a chance recently to check the mailing list, any new folks?
16:06:15 no one new around recently that I saw...
16:06:31 .hello salimma
16:06:32 michel-slm: salimma 'Michel Lind'
16:07:22 cool! moving on!
(sorry, just spilt rice all over the floor... the joys of multi-tasking!)
16:07:29 #topic Next chair
16:07:29 #info magic eight ball says:
16:07:29 #info chair 2023-09-28 pcreech
16:07:29 #info chair 2023-10-05 dtometzki
16:07:29 #info chair 2023-10-12 ???
16:07:30 #info chair 2023-10-19 ???
16:07:32 #info chair 2023-10-26 ???
16:07:36 any takers on the 12, 19, or 26?
16:07:40 oops. ;( rice overload
16:08:10 I will not be here on the 19th... I could take the 12th or 26th if no one else can
16:08:45 I can take the 19th i think?
16:08:54 i'm hesitant on the 12, going out of town that friday
16:08:56 can i swap my chair slot with anyone?
16:09:17 I am at a customer site on 10-05
16:09:33 I could take the 5th.
16:10:01 perfect, then i will take 10-12
16:10:10 many thanks nirik
16:10:15 (running the meeting on the 5th that is, not refusing to answer questions because they might incriminate me)
16:10:52 lol
16:11:27 updates made on the agenda!
16:11:37 #topic announcements and information
16:11:38 #info CPE Infra&Releng EU-hours team has a Monday through Thursday 30 minute meeting going through tickets at 0730 UTC in #centos-meeting
16:11:38 #info CPE Infra&Releng NA-hours team has a Monday through Thursday 30 minute meeting going through tickets at 1800 UTC in #fedora-meeting-3
16:11:47 thanks pcreech
16:12:17 any other announcements or information?
16:12:30 np dtometzki :D
16:14:34 ok, assuming silence means no more announcements or info!
16:14:35 #info nirik out 2023-10-19 to 23rd
16:14:42 lol... just as I say that :D
16:15:03 sorry, thinking slow today, need more coffee.
16:15:29 completely understand that... i've been drinking less coffee lately....
it's been rough
16:15:43 #topic Oncall
16:15:43 #info https://fedoraproject.org/wiki/Infrastructure/Oncall
16:15:43 #info https://docs.fedoraproject.org/en-US/cpe/day_to_day_fedora/
16:15:43 ## .oncalltakeeu .oncalltakeus
16:15:43 #info pcreech is on call from 2023-09-22 to 2023-09-28
16:15:44 #info dtometzki is on call from 2023-09-29 to 2023-10-05
16:15:46 #info nirik is on call from 2023-10-06 to 2023-10-12
16:15:48 #info ??? is on call from 2023-10-13 to 2023-10-19
16:15:50 #info ??? is on call from 2023-10-20 to 2023-10-26
16:15:52 Oncall time!
16:16:27 can take the 10-13
16:16:40 I'm hesitant to take 13-19, due to the aforementioned vacation.... cool. I'll then take the 20-26
16:17:35 looks like we have most of october covered for meetings and oncall! \o/
16:17:52 #info Summary of last week: (from current oncall)
16:18:50 Ok, so, I was only pinged once, a few hours before I got in for the day. There was some question about syncing between rhbz and fas, that appeared to resolve itself by the time I got in
16:18:58 for the day today*
16:19:40 Yeah, if I get time I might look and see if I can see anything there, but it sounded resolved...
16:20:43 #topic Monitoring discussion [nirik]
16:20:43 #info https://nagios.fedoraproject.org/nagios
16:20:43 #info Go over existing out items and fix
16:21:18 lets see...
16:22:03 currently pretty quiet... just a bad disk in an old server, space on mailman01 (but still a fair bit) and a vpn endpoint down
16:22:16 so, not much to really report there.
16:22:45 cool
16:23:09 Looks like we have a learning topic on deck for today. nirik you good to go?
16:23:40 I don't really have anything typed out, but I'm happy to go and talk on it. ;)
16:23:48 #topic Learning topic
16:23:55 ## info current storage setup and backups [nirik]
16:24:51 so, I will talk about our backup setup today... feel free to ask questions or ask me to expand on anything. ;)
16:25:29 First, backups are a great thing. I recommend everyone have them.
Additionally, if you do have backups, I recommend you try test restoring something from time to time to make sure they are actually working and backing up what you think they are.
16:25:48 There's a ton of backup software out there.
16:26:27 The last time we decided, I picked rdiff-backup for our backups. It had some nice features and at the time was reasonably well thought of
16:26:54 It's still alive to this day amazingly enough (although upstream did go through a lot of upheaval a while back)
16:27:49 It's written in python and the way it works is it makes a backup of the indicated files, then on each later incremental run it saves only the differences, and so on
16:27:58 So, it's pretty efficient space-wise.
16:28:22 We run our backups from backup01 (a hw box) to a netapp volume.
16:28:35 it runs ansible to schedule/do the actual calls
16:29:00 you can see the playbook in playbooks/rdiff-backup.yml
16:29:20 so the backup server ssh's to those machines, runs rdiff-backup and backs things up.
16:29:53 On any server where we back up data, we also back up /etc and /home... this is because in the past sometimes people would say 'oh, I had that in my homedir' after something got reinstalled.
16:30:07 This works fine for many things, but there are two cases it doesn't handle so well:
16:30:38 1. our src.fedoraproject.org git repos. There's just too many small files changing too fast to back up this way. So, for them we use grokmirror to mirror them to the backup volume.
16:31:02 2. databases. For this we do a local db dump and save the dump to the backup volume.
16:31:38 Additionally, our netapp backup volume is mirrored to another datacenter, so we have backups if our main dc went offline for a long time
16:32:23 This has worked reasonably well, but rdiff-backup is sometimes slow, or has weird issues that require re-creating the backup dir.
16:32:44 at home I have been using restic, and it's pretty awesome. It's super fast and has a lot of nice features.
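[editor's note] The incremental idea described at 16:27:49 (take a full copy once, then record only what changed on later runs) can be illustrated with a small sketch. This is not rdiff-backup's actual implementation or API; the function names `snapshot` and `changes` are hypothetical, and whole-file hashing is an assumption made purely for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to a SHA-256 digest of its contents."""
    digests: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changes(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list what an incremental run would need to record."""
    return {
        "added": sorted(set(current) - set(previous)),
        "removed": sorted(set(previous) - set(current)),
        "modified": sorted(
            p for p in current if p in previous and current[p] != previous[p]
        ),
    }
```

Only the paths reported by `changes` would need new data stored on an incremental run, which is why this style of backup stays small when little changes between runs.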
16:33:12 it has backends that allow you to save to things like s3 or backblaze, and can do encryption and compression and deduplication
16:33:51 * nirik tries to think of anything he's forgetting on the current setup.
16:33:56 any questions on all that?
16:34:15 * michel-slm likes restic
16:34:46 restic also lets you have a single repo if you want and back up multiple machines to it... then you get deduplication on all the stuff they share.
16:34:51 how long do we retain stuff right now? e.g. what's the oldest datestamp we can restore
16:35:14 and to add on to that, how often are full backups done vs diffs/deltas?
16:35:17 asking because I was wondering, if we ever switch to another backup solution, do we need to retain the old backups as well or just until they age out
16:35:28 Good question! Well, right now, we keep everything, and I keep meaning to do a prune, but... rdiff-backup is very slow to prune old snapshots.
16:36:04 I think keeping them for a year or two would be fine.
16:36:14 in practice we actually almost never restore anything
16:37:01 restic also lets you do a nicer set of pruning
16:37:17 like, keep one from each year, one from each month, etc
16:37:36 do we have backups of our netapp volumes (other than the backup volume itself) like the volumes used on openshift?
16:38:49 also good question. ;) it depends on the volume, we do back up some of them... but others we rely on snapshots.
16:39:28 so we have like periodic snapshots or is it just on-demand?
16:39:38 so right now we have backups since we moved to this dc
16:40:08 there is a scheduled set of them...
16:40:33 I think it's 4 weeks worth, 14 days, 24 hours by default?
16:41:27 that brings up our koji volume. it's way too big to back up anywhere. it does have some limited snapshots and is mirrored to another dc...
16:41:33 good :) Do we back up our ceph volumes as well, or is there no demand on that?
16:41:44 (yet)
16:42:29 we aren't currently.
mostly because we have said in the past that openshift apps shouldn't rely on storage... and most don't... db is external usually.
16:43:11 I'd like to look at restic to a local backup volume, then sync that or a subset to s3
16:43:53 oof, koji. That's a fun one... :D and I'm assuming it's the _whole_ data set since fc1 (or something)
16:43:58 that way we have stuff 'near' if we need it, but we don't take up as much space locally.
16:44:15 a lot of it's been moved to 'archive' volumes.
16:44:29 but somehow the stuff that's still in the main volume takes up a ton of room.
16:45:59 Oh, I forgot to mention:
16:47:07 list of hosts we back up is in inventory/backups and list of dirs (in addition to /home and /etc) we back up is in each host's vars as host_backup_targets
16:47:59 any other questions? :) I should go get coffee before my next meeting at the top of the hour if not.
16:49:11 I think i'm good :D
16:49:19 #topic Open Floor
16:49:25 Open floor time!
16:49:55 Thanks everyone for coming! thanks nirik for that great discussion on backups!
16:50:00 nirik++
16:50:43 nirik++
16:50:43 darknao: Karma for kevin changed to 3 (for the release cycle f39): https://badges.fedoraproject.org/tags/cookie/any
16:52:06 thanks pcreech
16:52:25 thanks pcreech
16:52:42 :D
16:54:23 #endmeeting
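[editor's note] The pruning style mentioned at 16:37:17 (keep one snapshot per month, one per year) can be sketched as below. It is loosely modelled on the behaviour of restic's `forget --keep-monthly` / `--keep-yearly` options but does not call restic; the function names and the default counts (12 months, 2 years, the latter echoing the "a year or two" remark) are assumptions for illustration only.

```python
from datetime import datetime


def keep_newest_per_bucket(
    snapshots: list[datetime], fmt: str, count: int
) -> set[datetime]:
    """Keep the newest snapshot in each of the `count` most recent buckets.

    `fmt` is a strftime pattern naming the bucket, e.g. "%Y-%m" for months.
    """
    newest: dict[str, datetime] = {}
    for snap in sorted(snapshots):
        newest[snap.strftime(fmt)] = snap  # later snapshots overwrite earlier ones
    recent = sorted(newest)[-count:] if count > 0 else []
    return {newest[bucket] for bucket in recent}


def select_to_keep(
    snapshots: list[datetime], monthly: int = 12, yearly: int = 2
) -> set[datetime]:
    """Union of the monthly and yearly keep sets; everything else could be pruned."""
    return keep_newest_per_bucket(snapshots, "%Y-%m", monthly) | keep_newest_per_bucket(
        snapshots, "%Y", yearly
    )
```

Any snapshot not returned by `select_to_keep` is a candidate for removal; a real policy would likely add daily and weekly tiers as well.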