15:00:36 #startmeeting Container Security
15:00:36 Meeting started Wed Aug 12 15:00:36 2015 UTC. The chair is jsmith. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:36 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:01:10 #info Dan Walsh using the example of the Three Little Pigs to explain container security
15:01:21 #info Chapter 1: When should I use containers versus virtual machines?
15:01:40 #info Chapter 2: What platform should host my containers?
15:01:53 #info Chapter 3: How do I ensure container separation?
15:02:24 #info Chapter 4: Images and container content
15:03:36 Glossary: pig == application
15:03:49 Standalone homes == separate physical machines
15:03:56 Duplex home == Virtual Machines
15:04:46 Virtual machines have a very small attack surface on the host kernel
15:06:05 Apartment building == containers
15:06:25 Because containers are talking directly to the host kernel, they're always slightly more dangerous than running VMs
15:07:33 Hostel == services running on the same machine
15:08:04 Sleeping in the park == setenforce 0
15:08:58 We'll focus on security in the apartment building (using containers)
15:10:07 #topic Chapter Two: What kind of apartment building
15:10:58 Apartment buildings built of straw == running containers on a do-it-yourself platform
15:11:04 Who is doing updates?
15:11:11 Is your kernel secure?
15:11:22 Who is focusing on updates and stability? You?
15:11:47 Apartment buildings built of sticks == running containers on a community platform
15:12:14 Better updates (for a limited time)
15:12:23 Brick == RHEL
15:13:52 #topic Chapter 3: How do I separate/secure pig apartments?
15:14:11 Containers don't contain :-p
15:14:36 (well, at least not as well as VMs, but better than individual services running on the same machine without SELinux running)
15:14:50 There are lots of people running containers and forgetting everything they learned about security
15:14:56 Things like:
15:15:10 * Running more privileged processes
15:15:17 * Running random crap from the internet as root
15:15:31 Treat container services like regular services:
15:15:40 * Drop privileges as quickly as possible
15:15:50 * Run your services as non-root whenever possible
15:16:01 * Treat root within a container the same as root outside of the container
15:16:44 "Docker is about running random crap from the internet as root on your host" -- Simo
15:19:04 Why don't containers contain?
15:19:11 Because not everything in Linux is namespaced
15:19:40 Kernel file systems (/sys, /sys/fs, /proc/sys) are not namespaced
15:19:53 cgroups, SELinux, /dev/mem, kernel modules as additional examples
15:20:02 Read only mount points
15:21:00 Capabilities...
15:23:29 A bunch of capabilities have been dropped from containers...
15:24:07 CAP_SYS_ADMIN and CAP_NET_ADMIN as examples
15:24:44 Can't run mount command under containers, can't load kernel modules, etc.
15:25:07 There are about 10 capabilities left that *are* available to root within a container
15:25:16 #info Namespaces
15:25:30 PID namespace restricts ability to see other processes
15:25:35 Network namespace -- separate network
15:26:02 #info Cgroups
15:26:10 Device Cgroup
15:26:15 Should have been a namespace
15:26:25 Controls which device nodes can be created within the namespace
15:27:03 Inside of a container, you're only going to see certain device nodes (/dev/console, /dev/zero, /dev/null, etc.)
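Editor's sketch, not part of the talk: the capability notes above (CAP_SYS_ADMIN and CAP_NET_ADMIN dropped, only a handful left for root in a container) can be checked from inside a process by decoding the `CapEff` bitmask in `/proc/self/status`. The helper names `decode_caps` and `effective_caps` and the sample mask are assumptions made for illustration; the bit numbers follow `<linux/capability.h>`.

```python
# Capability names indexed by bit number, as defined in <linux/capability.h>.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ",
]


def decode_caps(mask):
    """Return the names of the capabilities whose bits are set in mask."""
    return [name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1]


def effective_caps(status_path="/proc/self/status"):
    """Read the CapEff line of a /proc status file and decode it (Linux only)."""
    with open(status_path) as fh:
        for line in fh:
            if line.startswith("CapEff:"):
                return decode_caps(int(line.split()[1], 16))
    return []


if __name__ == "__main__":
    # Hypothetical sample mask chosen for illustration, not a value from the talk.
    sample_mask = 0xA80425FB
    caps = decode_caps(sample_mask)
    print(len(caps), "capabilities in sample mask")
    print("CAP_SYS_ADMIN present:", "CAP_SYS_ADMIN" in caps)
    try:
        print("This process:", effective_caps())
    except OSError:
        print("No /proc/self/status here (not Linux?)")
```

Running `effective_caps()` once as root on the host and once as root inside a container should show the difference the talk describes; which capabilities remain depends on the container runtime and its configuration.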
15:27:23 All images mounted with nodev
15:27:28 #info SELinux
15:29:12 Dan goes through the SELinux coloring book examples of cats and dogs
15:29:59 Type enforcement
15:30:07 Protects the host from container processes
15:31:30 Container processes can only read/execute /usr files
15:31:42 Container processes only write to container files
15:32:38 SELinux type enforcement doesn't keep containers from attacking each other
15:32:47 sVirt (MCS enforcement) does
15:33:21 It takes another part of the SELinux label (after the colon)
15:34:37 Docker assigns MCS label to all content (different MCS label for each container)
15:34:50 Launches the container processes with the same label
15:35:23 #info Future
15:35:31 SECCOMP
15:35:46 Shrink the attack surface on the kernel by eliminating syscalls
15:36:29 Developed by Google for the Chromium plugins
15:39:25 SECCOMP will be in Docker shortly
15:40:19 User namespace -- not the holy grail
15:40:30 There is still no filesystem support
15:41:05 Ranges of UIDs still aren't supported
15:41:33 Management of multiple UID ranges, different identities, and filesystems still causes lots of problems
15:42:52 Anything world readable on any container can be read by any other container -- user namespaces won't protect you (but MCS would)
15:43:16 #topic Chapter 4: How do you furnish the pigs' apartments?
15:45:29 Docker admins seem to forget about security (or the lessons learned over the past couple of decades)
15:45:35 Don't forget the "ops" in "devops"
15:46:56 Just because you update the software on your host machines doesn't mean it's being updated in the containers
15:47:16 Large percentage of docker images have security vulnerabilities (30%+)
15:48:16 How often do we as a distribution update container images?
15:48:23 How do you know a container image is out of date?
15:51:54 #topic Q and A
15:52:52 Question about kernel audit -- is it a shared service?
15:53:08 Right now, there is nothing different about container auditing than with host auditing
17:32:25 jsmith: Error: Can't start another meeting, one is in progress.
17:33:15 #topic Core Toolchain and toolchain features
17:33:33 As far as glibc/gcc/etc. we're pretty much on par with x86
17:33:41 PPC doesn't have full seccomp support, should land shortly
17:34:03 On a functional level, however, there's very little difference between arches
17:34:42 Golang, go, docker, and assorted projects
17:35:46 Server edition...
17:36:14 ... basically the same across arches -- some small differences that we're working on.
17:36:26 On the Power side of things we're 100%, aarch64 is close to 100%
17:36:32 Very stable, generally just works
17:36:53 Enabling more and more functionality with containers and RoleKit, but for all intents and purposes server edition is complete
17:36:57 Cloud edition...
17:37:07 On the Power side of things, first cloud images released as part of F22
17:37:29 Didn't quite make it there with some of the tools on aarch64 due to issues of console support, EFI bootloader support, various other bits and pieces
17:37:44 Should see qemu-based images for aarch64 for F23
17:37:53 as well as Docker images for both power and aarch64 for F23
17:40:37 Just because cloud images are labeled "Cloud" doesn't mean they can only be used in the cloud -- the qemu images can be used on private clouds, etc.
17:41:20 Workstation edition...
17:41:28 ... nothing in Power, built but not tested
17:41:47 ... same with aarch64 -- built but with very limited testing
17:44:04 Still waiting for next-gen hardware to be more widely available.
17:45:09 From the ARMv7 side, the nVidia Tegra stuff is almost all upstream now, and the final bits of kernel support are scheduled for 4.3
17:45:46 General user space
17:45:55 18k-odd packages in Fedora user space
17:46:04 The vast majority of packages are done
17:46:09 Power has mono
17:46:14 aarch64 doesn't yet have mono
17:46:18 But we're in pretty good shape
17:47:23 #topic Kernel
17:47:28 On the Power side of things, very boring
17:47:38 Some interesting features with regards to advanced power hardware
17:47:49 The best way to find out about those features is on the OpenPOWER website
17:47:52 Feature enablement on 4.2
17:48:09 By the time F23 comes out, things like transactional memory should be back
17:48:35 On the Aarch64 side, the kernel that F22 shipped with made it so we could drop a huge ARM patch
17:48:58 The only patch we now care about is for an out-of-stream pre-production NIC that will never go upstream
17:49:09 With the 4.1 kernel we landed the ACPI patch (massive) upstream
17:49:31 Not enabled by default, as it reduces functionality on the Mustang board
17:49:46 More patches coming into 4.3/4.4
17:49:56 By 4.4, it should be functionally equivalent to DeviceTree
17:50:08 The ability to use a pure upstream kernel makes Peter's life much better :-)
17:50:43 Was a huge step in the evolution of Aarch64 support
17:50:48 #topic Bootloaders
17:51:14 In the Aarch64 space, UEFI based... one of the issues is that we don't have the ability to redistribute a completely open UEFI bootloader
17:52:16 Jon Masters says to "watch this space" over the next little while
17:53:13 On the Power side, a standard open boot loader (OPAL?)
17:53:18 ARMv7 obviously we have u-boot
17:53:22 it's evolving quite nicely
17:53:40 We have something like 130-odd boards that we should be able to use with u-boot and extlinux
17:53:46 Still being polished
17:53:54 Still have a few issues with consoles, etc.
17:53:56 Most of it is upstream
17:54:06 We still carry three or four patches that we're working with upstream on
17:54:46 It's taken a lot of time and effort since the F13 timeframe to get to where we are with u-boot
17:55:01 #topic Anaconda installer
17:55:13 Anaconda works on ARM
17:55:27 You can PXE boot from u-boot and do a completely interactive or kickstart install, just like x86
17:55:35 The same is the case on ARM 64 and Power
17:55:53 While the lack of optical media makes things somewhat different, the functionality in anaconda is basically the same and works
17:57:27 Still a few issues if you have to write u-boot to the SD card, etc.
17:58:11 On aarch64 with UEFI for the mustang/seattle boards, optical media or USB boot just works :-)
17:58:28 #topic Questions...
17:59:21 When can we use Koji for secondary arches and get rid of koji-shadow?
18:00:57 Peter goes into much too much detail on how he'd like to see this, but the transcriber can't keep up :-p
18:01:12 No exact roadmap on how or when that might happen, but it's being discussed
18:22:06 #endmeeting