an elegant way to fix user IDs in docker containers using docker_userid_fixer

what is it about?
It's about a rather technical issue in using docker containers that interact with the docker host computer, generally related to using the host filesystem inside the container.
This happens in particular in a reproducible research context.
I developed an open-source utility that helps tackle that issue.
docker containers as execution environments
The initial and main use case of a docker container is a self-contained application that only interacts with the host system through some network ports.
Think of a web application: the docker container typically contains a web server and a web application, running for example on port 80 (inside the container). The container is then run on the host, by binding the container internal port 80 to a host port (e.g. 8000).
Then the only interaction between the containerized app and the host system is via this bound network port.
Containers as execution environments are completely different:
- instead of containerizing an application, it's the application build system that is containerized.
- it could be a compiler, an IDE, a notebook engine, a Quarto publishing system...
- the goals are:
  - to have a standard, easy to install and share environment
    - imagine a complex build environment, with fixed versions of R, python and zillions of external packages. Installing everything with the right versions can be a very difficult and time-consuming task. Sharing a docker image that contains everything already installed and pre-configured is a real time-saver.
  - to have a reproducible environment
    - by using it, you are able to reproduce some analysis results, since you are using the very same controlled environment
    - you can also easily reproduce bugs, which is the first step to fixing them
But, in order to use those execution environments, those containers must have access to the host system, in particular to the host user filesystem.
docker containers and the host filesystem
Suppose you have containerized an IDE, e.g. RStudio.
Your RStudio is installed and running inside the docker container, but it needs to read and edit files in your project folder.
For that you bind mount your project folder (in your host filesystem) using the docker run --volume option.
Then your files are accessible from within the docker container.
The challenge now is file permissions. Suppose your host user has userid 1001, and suppose that the user owning the RStudio process in the container has userid either 0 (root) or 1002.
If the container user is root, then it will have no issue reading your files.
But as soon as you edit some existing files, or produce new ones (e.g. pdf, html), these files will also belong to root on the host filesystem!
This means that your local host user will not be able to use them, or delete them, since they belong to root.
Now if the container userid is 1002, RStudio may not be able to read your files, edit them or produce new files.
Even if it can, by setting some very permissive permissions, your local host user may not be able to use the files it produces.
Of course, one brute-force way of solving that issue is to run as root both on the host computer and within the docker container. This is not always possible and raises some obvious critical security concerns.
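To make the ownership problem concrete, here is a minimal sketch of the classic owner/group/other write check, applied to the userids used above (0, 1001, 1002). The function name can_write is made up for this illustration, and the model is deliberately simplified: it ignores ACLs, capabilities and user namespaces.

```python
import stat


def can_write(proc_uid, proc_gids, file_uid, file_gid, mode):
    """Simplified POSIX-style write check (illustrative only).

    root bypasses the permission bits; otherwise exactly one class
    (owner, then group, then other) decides.
    """
    if proc_uid == 0:
        return True
    if proc_uid == file_uid:
        return bool(mode & stat.S_IWUSR)
    if file_gid in proc_gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# A project file owned by the host user 1001, with mode rw-r--r--:
print(can_write(0, {0}, 1001, 1001, 0o644))        # container running as root
print(can_write(1002, {1002}, 1001, 1001, 0o644))  # container user 1002
# A file created by root in the container, seen by host user 1001:
print(can_write(1001, {1001}, 0, 0, 0o644))
```

The last call shows the situation described above: a file created by root inside the container cannot be modified by the host user afterwards.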
solving the file owner issue part 1: the docker run --user option
Because we cannot know in advance what the host userid will be (here 1001), we cannot pre-configure
the userid of the docker container user.
docker run provides a --user option that runs the container as a pseudo user with a userid supplied
at runtime. For example, docker run --user 1001 ... will start a docker container whose processes
belong to a user with userid 1001.
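As a small illustration, the sketch below builds such a docker run command line from the current host userid. The helper name docker_run_command, the image name my-rstudio-image and the /work mount point are placeholders invented for this example; only the --rm, --user and --volume options are actual docker run options.

```python
import os


def docker_run_command(image, project_dir, uid=None):
    """Build a docker run argument list that uses the host userid.

    image and project_dir are supplied by the caller; /work is an
    arbitrary (hypothetical) mount point inside the container.
    """
    if uid is None:
        uid = os.getuid()  # userid of the host user launching the container
    return [
        "docker", "run", "--rm",
        "--user", str(uid),
        "--volume", f"{project_dir}:/work",
        image,
    ]


# Hypothetical image name and project folder, for illustration only.
cmd = docker_run_command("my-rstudio-image", "/home/me/project", uid=1001)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run on a machine where docker is installed.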
So why are we still discussing this issue? Isn't it solved?
Here are some quirks of that dynamically created user:
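- it is a pseudo user
- it has no home directory (/home/xxx)
- it does not appear in /etc/passwd
- it cannot be pre-configured, e.g. with a bash profile, some environment variables, application defaults, etc.
We can work around those issues, but it can be tedious and frustrating.
What we would really like is to pre-configure our docker container user, and
be able to dynamically change its userid at runtime...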
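The "does not appear in /etc/passwd" quirk can be observed from inside a container with the standard pwd module, which raises KeyError for a userid that has no entry. The sketch below wraps that lookup; describe_user is a name made up for this example, and the lookup parameter exists only so the function can be exercised without a container.

```python
import pwd


def describe_user(uid, lookup=pwd.getpwuid):
    """Return (name, home) for uid, or None if the user database has no entry.

    A userid injected with docker run --user typically has no entry,
    hence no name and no home directory.
    """
    try:
        entry = lookup(uid)
    except KeyError:
        return None
    return entry.pw_name, entry.pw_dir


# Inside a container started with --user 1001, this would typically print
# None; for a pre-configured user it prints its name and home directory.
print(describe_user(1001))
```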
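solving the file owner issue part 2: docker_userid_fixer
docker_userid_fixer is an open-source utility intended to be used as a docker entrypoint to fix the userid issue I just raised.
Let's see how to use it: you set it as your docker ENTRYPOINT, specifying which user (user1 in what follows) should be used and
have its userid dynamically modified.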
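Let's be precise about the terminology:
- the target user is the user requested from docker_userid_fixer, here user1
- the requested user is the user provisioned by docker run, i.e. the user that (initially) owns the first process (PID 1)
Then, when the container runtime is created, there are two options:
- either the requested userid (already) matches the target userid, and nothing has to be modified
- or it does not. For example, the requested userid is 1001 and the target userid is 1000. Then docker_userid_fixer fixes the userid of the target user user1 from 1000 to 1001, directly in the container main process.
In practice this solves our problem:
- if you do not need to fix your container userid, just use docker run the usual way (without the --user option)
- otherwise, use the docker run --user option, and your pre-configured user runs with the dynamically modified userid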
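The two options above can be modelled in a few lines. This is only a toy model of the decision, not the utility's actual code: the container's user database is represented by a plain dict, and fix_userid is a name invented for this illustration.

```python
def fix_userid(users, target_user, requested_uid):
    """Toy model of the userid fix.

    users maps user names to userids (a stand-in for the container's
    user database). Returns a new mapping in which target_user has
    requested_uid; the input mapping is returned unchanged when the
    ids already match.
    """
    target_uid = users[target_user]
    if requested_uid == target_uid:
        return users  # option 1: nothing has to be modified
    fixed = dict(users)
    fixed[target_user] = requested_uid  # option 2: e.g. 1000 becomes 1001
    return fixed


container_users = {"root": 0, "user1": 1000}
print(fix_userid(container_users, "user1", 1000))  # docker run without --user
print(fix_userid(container_users, "user1", 1001))  # docker run --user 1001
```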
docker_userid_fixer setup
The setup instructions are in the project repository.
But it boils down to:
- build or download the tiny executable (17k)
- copy it into your docker image
- make it setuid root
- set it as your ENTRYPOINT
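I put some short notes at https://github.com/kforner/docker_userid_fixer#how-it-works
but I'll try to rephrase.
The crux of the implementation is that the docker_userid_fixer executable is setuid root inside the container.
Root privileges are needed to change a userid, and the setuid bit enables that privileged execution only for the
docker_userid_fixer program, and only for a very short time.
Once the userid has been fixed, docker_userid_fixer switches to the requested user (and userid).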
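The final "switch to the requested user" step follows the usual privilege-dropping order on POSIX systems: supplementary groups first, then the group id, then the user id. The sketch below only illustrates that order of calls under these assumptions. It is not the utility's code (which is a compiled executable; Linux ignores the setuid bit on scripts), and switch_to_user is a hypothetical name.

```python
import os


def switch_to_user(uid, gid):
    """Drop to the given user and group, then report (real, effective) uid.

    The order matters: once the userid is no longer root, the process
    may not be allowed to change its groups anymore.
    """
    if os.geteuid() == 0:
        os.setgroups([])  # clearing supplementary groups requires privileges
    os.setgid(gid)
    os.setuid(uid)
    return os.getuid(), os.geteuid()


# Switching to the ids the process already has is always permitted,
# so this call is safe to run as an ordinary user.
print(switch_to_user(os.getuid(), os.getgid()))
```

When the process starts with root privileges, os.setuid changes the real, effective and saved userids, so the privileges cannot be regained afterwards.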
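If you are interested in these topics (Docker, reproducible research, R package development, algorithms, performance optimization, parallelism...), do not hesitate to contact me to discuss job and business opportunities.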


