Intro#

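This post covers a one-day vulnerability related to Docker container escape that I analyzed as part of my work with team TOOR.

The vulnerability discussed here was disclosed on January 31, 2024, and affects runc, the low-level container runtime used by Docker.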

https://miro.medium.com/v2/resize:fit:828/format:webp/1*CZD4P0OpVML_vsO7RNRevA.png

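The vulnerability affects `runc`, Docker's low-level container runtime, in versions 1.1.11 and earlier. It is caused by a file descriptor that runc fails to handle properly while creating a container, which leaves the descriptor exposed to the container process. As a result, a user can access the host operating system's file system and may use that access to escape the Docker container and reach the host.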

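This post is the result of reading write-ups by researchers who studied this issue before me, then working through my own analysis to understand and organize what I learned. Some parts of that analysis may be wrong; if you spot an error, please let me know and I will gladly incorporate the feedback. The following resources were especially helpful for analyzing the vulnerability and the PoC.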

Vuln#

CVE-ID : CVE-2024-21626
CWE-ID : CWE-668 (Exposure of Resource to Wrong Sphere), CWE-403 (Exposure of File Descriptor to Unintended Control Sphere ('File Descriptor Leak'))

RCA#

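The vulnerability arises because runc creates the container without properly closing the /sys/fs/cgroup handle it opened during container setup, and because it does not sufficiently validate the current working directory. As a result, an attacker inside the created container can reach the host operating system's file system through a path of the form /proc/self/fd/<file descriptor>. Suppose the host's /sys/fs/cgroup ends up mapped to /proc/self/fd/7. (The descriptor number is assigned by runc's internal mechanics and follows other patterns as well, so it does not have to be 7.) In that case, setting the Dockerfile WORKDIR instruction as follows gives the container access to the host file system.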

    WORKDIR /proc/self/fd/7

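With this instruction, the container process ends up with a current working directory that lives in the host's file system (mount) namespace. In other words, the attacker can see host operating system resources that lie outside the container.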
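Because the descriptor number is not fixed, one way to reason about this attack surface is to flag any working directory that resolves through a procfs descriptor entry. The sketch below is only an illustration: `isSuspiciousWorkdir` and `procFdPattern` are hypothetical names that are not part of runc or Docker, and a purely lexical check like this would not catch the symlink variant of the attack.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

// procFdPattern matches paths that go through a procfs file descriptor
// entry, e.g. /proc/self/fd/7 or /proc/1/fd/8/subdir.
var procFdPattern = regexp.MustCompile(`^/proc/(self|thread-self|[0-9]+)/fd/[0-9]+(/.*)?$`)

// isSuspiciousWorkdir is a hypothetical pre-flight check: it reports whether
// the lexically cleaned working directory points at a /proc/<pid>/fd/<n> entry.
func isSuspiciousWorkdir(workdir string) bool {
	return procFdPattern.MatchString(filepath.Clean(workdir))
}

func main() {
	candidates := []string{"/proc/self/fd/7", "/proc/self//fd/8/", "/app", "/proc/self/status"}
	for _, wd := range candidates {
		fmt.Printf("%-20s suspicious=%v\n", wd, isSuspiciousWorkdir(wd))
	}
}
```

A check of this kind is at best a heuristic for scanning Dockerfiles or container specs; the actual fix in runc verifies the working directory after it has been set rather than inspecting the path string.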

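If the user inside the container has UID 0, the attacker can go further from the escaped file system, for example by adding an SSH key or by modifying and planting files.

With a payload like the following, the attacker can access files on the host.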

    cat /proc/1/cwd/../../../../../../../../../../../../../etc/passwd

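The vulnerability can be abused in a similar way when it is possible to create another container from inside a container with the docker command (the attack principle is the same). The attacker creates a symbolic link inside the container that points to an unclosed host file descriptor such as /proc/self/fd/7, then starts another container with -w (which sets the cwd) pointing at that link, which gives access to the host file system.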

Patch#

The fix consists of the following four changes.

Working directory verification (8e1cd2f)#

Exploitation works by changing the current working directory. To prevent this, code was added that verifies the cwd.

    // verifyCwd ensures that the current directory is actually inside the mount
    // namespace root of the current process.
    func verifyCwd() error {
      // getcwd(2) on Linux detects if cwd is outside of the rootfs of the
      // current mount namespace root, and in that case prefixes "(unreachable)"
      // to the returned string. glibc's getcwd(3) and Go's Getwd() both detect
      // when this happens and return ENOENT rather than returning a non-absolute
      // path. In both cases we can therefore easily detect if we have an invalid
      // cwd by checking the return value of getcwd(3). See getcwd(3) for more
      // details, and CVE-2024-21626 for the security issue that motivated this
      // check.
      //
      // We have to use unix.Getwd() here because os.Getwd() has a workaround for
      // $PWD which involves doing stat(.), which can fail if the current
      // directory is inaccessible to the container process.
      if wd, err := unix.Getwd(); errors.Is(err, unix.ENOENT) {
        return errors.New("current working directory is outside of container mount namespace root -- possible container breakout detected")
      } else if err != nil {
        return fmt.Errorf("failed to verify if current working directory is safe: %w", err)
      } else if !filepath.IsAbs(wd) {
        // We shouldn't ever hit this, but check just in case.
        return fmt.Errorf("current working directory is not absolute -- possible container breakout detected: cwd is %q", wd)
      }
      return nil
    }
    ...
      // Make sure our final working directory is inside the container.
      if err := verifyCwd(); err != nil {
        return err
      }
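The comment in the patch relies on the fact that the raw getcwd(2) system call prefixes the returned string with "(unreachable)" when the cwd lies outside the current mount namespace root. A minimal sketch of that classification on a raw string is shown here; `classifyRawCwd` and `errCwdOutsideRoot` are hypothetical names for illustration, and the real patch relies on `unix.Getwd()` returning ENOENT instead of parsing the string itself.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errCwdOutsideRoot is a hypothetical sentinel error used by this sketch.
var errCwdOutsideRoot = errors.New("cwd is outside the mount namespace root")

// classifyRawCwd inspects a string in the form the raw getcwd(2) system call
// would return it and reports whether it describes a safe, absolute cwd.
func classifyRawCwd(raw string) error {
	switch {
	case strings.HasPrefix(raw, "(unreachable)"):
		// The kernel marks a cwd that cannot be reached from the current root.
		return errCwdOutsideRoot
	case !strings.HasPrefix(raw, "/"):
		return fmt.Errorf("cwd %q is not absolute", raw)
	default:
		return nil
	}
}

func main() {
	for _, raw := range []string{"/app", "(unreachable)/sys/fs/cgroup", "relative/dir"} {
		fmt.Printf("%q -> %v\n", raw, classifyRawCwd(raw))
	}
}
```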

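Closing open file descriptors before container creation (execve) (f2f1621)#

The vulnerability stems from file descriptors that were opened before the container was created. To prevent this class of problem, code was added that closes open file descriptors before the container is created.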

    // CloseExecFrom sets the O_CLOEXEC flag on all file descriptors greater or
    // equal to minFd in the current process.
    func CloseExecFrom(minFd int) error {
      // Use close_range(CLOSE_RANGE_CLOEXEC) if possible.
      if haveCloseRangeCloexec() {
        err := unix.CloseRange(uint(minFd), math.MaxUint, unix.CLOSE_RANGE_CLOEXEC)
        return os.NewSyscallError("close_range", err)
      }
      // Otherwise, fall back to the standard loop.
      return fdRangeFrom(minFd, unix.CloseOnExec)
    }

    //go:linkname runtime_IsPollDescriptor internal/poll.IsPollDescriptor

    // In order to make sure we do not close the internal epoll descriptors the Go
    // runtime uses, we need to ensure that we skip descriptors that match
    // "internal/poll".IsPollDescriptor. Yes, this is a Go runtime internal thing,
    // unfortunately there's no other way to be sure we're only keeping the file
    // descriptors the Go runtime needs. Hopefully nothing blows up doing this...
    func runtime_IsPollDescriptor(fd uintptr) bool //nolint:revive

    // UnsafeCloseFrom closes all file descriptors greater or equal to minFd in the
    // current process, except for those critical to Go's runtime (such as the
    // netpoll management descriptors).
    //
    // NOTE: That this function is incredibly dangerous to use in most Go code, as
    // closing file descriptors from underneath *os.File handles can lead to very
    // bad behaviour (the closed file descriptor can be re-used and then any
    // *os.File operations would apply to the wrong file). This function is only
    // intended to be called from the last stage of runc init.
    func UnsafeCloseFrom(minFd int) error {
      // We cannot use close_range(2) even if it is available, because we must
      // not close some file descriptors.
      return fdRangeFrom(minFd, func(fd int) {
        if runtime_IsPollDescriptor(uintptr(fd)) {
          // These are the Go runtimes internal netpoll file descriptors.
          // These file descriptors are operated on deep in the Go scheduler,
          // and closing those files from underneath Go can result in panics.
          // There is no issue with keeping them because they are not
          // executable and are not useful to an attacker anyway. Also we
          // don't have any choice.
          return
        }
        if logs.IsLogrusFd(uintptr(fd)) {
          // Do not close the logrus output fd. We cannot exec a pipe, and
          // the contents are quite limited (very little attacker control,
          // JSON-encoded) making shellcode attacks unlikely.
          return
        }
        // There's nothing we can do about errors from close(2), and the
        // only likely error to be seen is EBADF which indicates the fd was
        // already closed (in which case, we got what we wanted).
        _ = unix.Close(fd)
      })
    }

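Preventing leaks of the `/sys/fs/cgroup` handle (89c93dd)#

The patch introduces cgroupRootHandle, an *os.File, in place of the raw cgroupFd descriptor. Because the handle is now an *os.File, it is cleaned up automatically by the garbage collector once it goes out of scope, for example when setup fails with an error.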

      cgroupRootHandle *os.File
    ...

      if err != nil {
        err = &os.PathError{Op: "openat2", Path: path, Err: err}
        // Check if cgroupRootHandle is still opened to cgroupfsDir
        // (happens when this package is incorrectly used
        // across the chroot/pivot_root/mntns boundary, or
        // when /sys/fs/cgroup is remounted).
        //
        // TODO: if such usage will ever be common, amend this
        // to reopen cgroupRootHandle and retry openat2.
        fdPath, closer := utils.ProcThreadSelf("fd/" + strconv.Itoa(int(cgroupRootHandle.Fd())))
        defer closer()
        fdDest, _ := os.Readlink(fdPath)
        if fdDest != cgroupfsDir {
          // Wrap the error so it is clear that cgroupRootHandle
          // is opened to an unexpected/wrong directory.
          err = fmt.Errorf("cgroupRootHandle %d unexpectedly opened to %s != %s: %w",
            cgroupRootHandle.Fd(), fdDest, cgroupfsDir, err)
        }
        return nil, err
      }

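Setting `O_CLOEXEC` on all non-stdio file descriptors before `runc init` runs (ee73091)#

To prevent file descriptor leaks, the O_CLOEXEC flag is set on every non-stdio file descriptor that is open before runc init is started.

A descriptor with this flag set is closed automatically when the process calls an exec-family system call. Note that fork alone does not close it: the child still inherits the descriptor until it execs.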

      // Before starting "runc init", mark all non-stdio open files as O_CLOEXEC
      // to make sure we don't leak any files into "runc init". Any files to be
      // passed to "runc init" through ExtraFiles will get dup2'd by the Go
      // runtime and thus their O_CLOEXEC flag will be cleared. This is some
      // additional protection against attacks like CVE-2024-21626, by making
      // sure we never leak files to "runc init" we didn't intend to.
      if err := utils.CloseExecFrom(3); err != nil {
        return fmt.Errorf("unable to mark non-stdio fds as cloexec: %w", err)
      }