Commits
Author | Commit | Message | Commit date | Issues | |
---|---|---|---|---|---|
Denis Silakov | 2b388db4cb9 | AUTO Version bump to 3.12.5.13 | |||
Pavel Tikhomirov | c5630c6cff5 | fix mount-v2: temporary mount internal yards to mntns trees | |||
Pavel Tikhomirov | d9fa626d214 | zdtm: add new pidns_proc test. This test creates a nested pid namespace and mounts proc for it. After c/r we check that the pid namespaces of our proc mount and /proc differ. On old criu without patches for nested pidns proc support they will be the same. We also create file/dir children mounts in our proc similar to what we see in a docker container, create one extra tmpfs bind child and create an overmount to check corner cases. ... | |||
Pavel Tikhomirov | b9c0c07526c | mount-v2: delayed nested pidns owned proc mounting. Temporarily remove procfs mounts owned by nested pidns from the mount tree on the first stage; we have helpers to restore them on the second stage after forking all tasks. On the second stage we first mount a helper for each procfs which is visible from mntns in internal yard. Second, we run a tree walk for each delayed (not yet mounted) mount in the mntns and bind-mount it into the tree either from ... | |||
Pavel Tikhomirov | 3e055bfbc25 | mount-v2: add resolve_mnt_fd helper. This one does almost the same as resolve_mnt_path_fd, but it is a bit more tricky. resolve_mnt_path_fd opens any path in a mount tree where we know mp_fd_id and mnt_fd_id for each mount, but resolve_mnt_fd wants to open mnt_fd_id for the currently mounted mount, and thus we surely don't have it already =). We want to find the root dentry of the currently mounted mount M. If our parent P mount h... | |||
Pavel Tikhomirov | 688f0bee91c | mount-v2: split out __resolve_mnt_path_fd helper. Split out the part of resolve_mnt_path_fd which finds the pair of proper mp_fd_id or mnt_fd_id and relative path offset into a separate helper. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 5922a88ab1a | mount-v2: add resolve_fd_path helper. The purpose of this function is to get an fd to a dentry at some path from the given mount. It uses the mnt_fd_id and mp_fd_id values of already mounted mounts and thus handles any type of overmount which can hide the path from the normal filesystem view. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 0b1256942e7 | mount-v2: create helpers for ancestors of nested pidns procs. Put helpers into internal yards, copy all required stuff for mounting them from the prototype mount and add them to mnt_bind lists. To make life easier we set root to "/"; this makes all helper mounts directory mounts and non-deleted. For now we don't support the case of external mounts over nested pidns owned procfs with non-"/" root. It is hard to handle because we can't change root to "/" for their he... | |||
Pavel Tikhomirov | ea9bbd4ac3c | mount: export mnt_subtree_next helper. Will need it in mount-v2.c. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 1d789ee92d3 | mount-v2: dump owner pidns'es of proc mounts. If we have a procfs fsroot mount we can resolve the pidns from which it was mounted by just reading the link of the "1/ns/pid" file. On restore, copy the owner pidns id to all other non-fsroot bindmounts. Also put mounts with a nested pidns owner on a list, so that restore now knows which proc should be mounted from which pidns; the actual mounting will come in the next patches. In case there is no info about pidns owner... | |||
Pavel Tikhomirov | 72b97907884 | mount-v2: temporary mount enabled internal yards to mntns trees. To mount them we simply add them to the tree before the general mounting code. We mount the internal yard to "/internal-yard-XXXXXX" in the root mount of the mountns. We also umount internal yards, with all descendants, in fini_restore_mntns_v2 when we don't need them anymore, as they should not be visible after restore. We also need to remove the temporary directory. We need to enter owner userns of the mn... | |||
Pavel Tikhomirov | 1ff0021c3e9 | mount-v2: prepare to mount internal yard mounts. Also bind-mount the host's proc inside the internal yard. It will be used to look under overmounts. We will add helper mounts as children of the internal yard, so prepare to make directories for them. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 7dcbfca78ea | mount-v2: create internal yard mounts for each mntns. We want to split mount restore into two steps: we will mount some of the mounts to the tree in each mount namespace, but the rest will be mounted later (mounts which depend on pid namespaces can be mounted only after forking all tasks). To be able to mount the rest of the mounts we need to somehow handle the case where their mountpoints in the tree are already overmounted. For that we have mp_fd_id ... | |||
Pavel Tikhomirov | ef1915e6c9d | mount: make is_dir int and -1 initialized. If is_dir is preset elsewhere, skip is_dir detection in create_plain_mountpoint. Will preset is_dir in the next patches. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 30d79bb8216 | mount: move yard, is_overmounted and merging trees to read_mnt_ns_img. Make mount preparations early so that all tasks can see them. This also removes mount-v1 vs mount-v2 code duplication. While at it: the root yard is outside of the namespace by definition, so let it have zero ns_mountpoint; it should not be used anyway. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 70ec2393fd2 | mount: fix children overmount check to exclude root yard. Root yard has no valid ns_mountpoint and by construction has no children overmounts. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 534001c9f44 | mount-v2: restore_mount_sharing_options at the end of forking stage. Function restore_mount_sharing_options should be called from the service mount namespace as it needs to have access to the external mount source. After the forking stage the main criu task is the only one left in the service mount namespace. So if we want to delay it to the end of the forking stage, we need to call it from the main criu task. Create special helper fini_restore_mntns_v2 for mounts-v2 actions which... | |||
Pavel Tikhomirov | 8602a8d2a4a | cr-restore: move CR_STATE_RESTORE switch to restore_root_task. The main criu task switches to CR_STATE_FORKING, but the next stage, CR_STATE_RESTORE, is special and is switched to by the root task instead, while the main task waits for CR_STATE_RESTORE to finish. Change it so that the actual switch is done in the main task, and the root task still waits for all other tasks and calls its stage ending callbacks (ve_itty_resolve and ve_itty_resolve) and gi... | |||
Pavel Tikhomirov | 0109bf94f5b | mount-v2: call resolve_shared_mounts_v2 earlier from read_mnt_ns_img. This way all criu tasks will see shared groups, as we set them up before forking any tasks of the restored process tree. Create special helper read_mnt_ns_img_v2 for preparing the mount tree just after reading it from images for mounts-v2. Also move search_bindmounts to read_mnt_ns_img as it is required for shared group resolution. So now all criu tasks also see bindmounts resolved. Signed-off-by: P... | |||
Pavel Tikhomirov | 6da8cfebf23 | mount: put mounted to shared memory. We will delay mounting several mounts in the next patches to the criu main task after forking processes. So, to be able to see from the main task which mounts were already mounted in the criu root task, we need to put it into shared memory. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | aef1ac84976 | mount: put mp_fd_id and mnt_fd_id to shared memory. We will delay restore_mount_sharing_options in the next patches, and it will be called from the criu task, so we need to share ids on the mountinfo to make the criu task see ids set up by the root task. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 32aa3da9f26 | mount: make general place for shared variables on mount-info on restore. Put remounted_rw into it. This allows us to easily add more such variables without allocating each one separately. Due to the existence of shfree_last, a shmalloc'ed region can be inherited from the previous caller, so it needs to be explicitly zero initialized. Fixes: 19bd17bd9 ("ghost/mount: allocate remounted_rw in shmem to get info from other processes") Signed-off-by: Pavel Tikhom... | |||
Pavel Tikhomirov | 0022e7e3058 | mount: detect unsupported mntns root overmount. We can't enter a mount namespace and see all mounts if the mountns has an overmounted root; we get the overmount as the root of our chroot instead. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 15abd1b3b4f | cr-restore: in out_kill_network_unlocked we should not destroy helpers. Fixes: 31680a217 ("pid: Create pid_ns helpers") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 538e1958146 | util: don't take nested lock for call_in_child_process. We can't make a nested call_in_child_process call because we try to take the last pid lock twice; fix it. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | db93017cacd | mount: fix broken remounted_rw check. Expression (x && REMOUNTED_RW) is always the same as just (x). It should have been (x & REMOUNTED_RW) to check if the mount is marked as temporarily remounted writable and needs to be switched back. By fixing this check we eliminate excess readonly remounts. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 949e339523b | mount-v2: use plain_mountpoint explicitly in set_unbindable_v2. The mount-v2 code already uses plain_mountpoint everywhere except this place; make it unified. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 632f09ef6ad | mount-v2: add shared group restore debug. Now we can see the whole process in logs in case of related problems. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Denis Silakov | da0b1afd326 | AUTO Version bump to 3.12.5.12 | |||
Pavel Tikhomirov | c1acfd7a042 | files-reg: fix open_remap_spfs_linked remap has no mnt_id. We've got "Error (criu/files-reg.c:888): The -1 mount is not found for ghost" because create_link_remap dumps RegFileEntry without mnt_id for link-remap; that is because the same mnt_id is assumed as for the original rfi from which we did the remap. So just remove the rmi part to return to the old behaviour; we can do it as spfs link remaps are remapped only once, see dump_linked_remap_type. https://jira.sw.r... | PSBM-105661 | ||
Denis Silakov | caf3371a6f8 | AUTO Version bump to 3.12.5.11 | |||
Andrey Zhadchenko | d2ae326b865 | restorer: fix criu fail with lazy-pages and pre c/r. In case of pre-dump and lazy-migration criu disables THP for some time. All vmas created during that time get the VMA_NOHUGEPAGE flag. If madvise is called with VMA_NOHUGEPAGE atop of an already existing VMA_NOHUGEPAGE it will return -EINVAL. This case should not lead to restoration failure. RHEL8 kernel seems to have that fixed. Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com> | |||
Andrey Zhadchenko | 57d316a4ab2 | jenkins: exclude ns_file_bindmount and thp_disable from tests. Since ns_file_bindmount works only with mountv2, exclude it from the lazy-pages test group. The thp_disable test works incorrectly with pre-dump and lazy-pages options on the rhel7 kernel. Disable it for a while. Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com> | |||
Mike Rapoport | da2b056dec4 | lazy-pages: fix stack detection. The commit 5432a964dcc7 ("lazy-pages: don't mark current stack page as lazy") tried to make the pages surrounding the stack pointers non-lazy. Unfortunately, it used a wrong mask for the detection. Fix it. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrei Vagin <avagin@gmail.com> (cherry picked from commit 1bc68dd873cbc0bbeba0e365a3ababe120d2ef2e) Signed-off-by: Andrey ... | |||
Andrey Zhadchenko | 7270c718790 | pstree: fix nested namespaces for lazy-pages. Various tests with nested namespaces (pidns00, etc.) fail because page-server is unable to construct pstree from image: "Error (criu/pstree.c:631): ns has no parent". Both checkpoint and restore (page_server_init_send and cr_lazy_pages) routines used in lazy-pages mode should respect nested namespaces when they construct pstree. Make sure that constructing initial pstree with prepare_dummy_pstr... | PSBM-104329 | ||
Denis Silakov | 1c842e34278 | AUTO Version bump to 3.12.5.10 | |||
Michał Cłapiński | 1b74502307e | Add ZDTM tests for child subreaper property. 1. Basic check if property is migrated 2. Check that property is restored for existing children 3. Check that child subreaper does not affect reparenting Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Michał Cłapiński <mclapinski@google.com> Reviewed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> (cherry-picked from commit 6606f246c21a8e1ff40b56087a360c1ff6fbe6bd... | PSBM-104746 | ||
Michał Cłapiński | dd8f830193c | Add support for migrating CHILD_SUBREAPER prctl. 1. Checkpoint it via parasite. 2. Restore it after forking. Signed-off-by: Michał Cłapiński <mclapinski@google.com> Reviewed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> (cherry-picked from commit db2777e73c3cf44ddb75aa4ab9f6e5bb88705571) https://jira.sw.ru/browse/PSBM-104746 Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com> | PSBM-104746 | ||
Denis Silakov | 7926d43844d | AUTO Version bump to 3.12.5.9 | |||
Pavel Tikhomirov | 56a38837a28 | mount: add one more list validation check to cr_time mount removal. We don't want to corrupt the list with this removal. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 31a02062c68 | mount: fix mnt_sharing list not initialized. We've caught an uninitialized mnt_sharing list with vzt: (00.177608) 1: Error (criu/mount.c:3926): mnt: BUG at criu/mount.c:3926 (gdb) list 3926 3926 BUG_ON(!list_empty(&cr_time->mnt_sharing)); (gdb) p cr_time->mnt_sharing $1 = {prev = 0x0, next = 0x0} https://jira.sw.ru/browse/PSBM-105464 Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | PSBM-105464 | ||
Denis Silakov | d3a28c3fa41 | AUTO Version bump to 3.12.5.8 | |||
Pavel Tikhomirov | 7ea847ad5c8 | mount-v2: treat mount as file-bindmount if mountpoint is not directory. All S_IFIFO, S_IFCHR, S_IFBLK, S_IFREG, S_IFLNK and S_IFSOCK are files in terms of mountpoints, so let's treat them correspondingly. With the ubuntu-20.04-x86_64 ostemplate CT we have a chardev mountpoint: (00.368896) 1: Error (criu/mount-v2.c:515): mnt-v2: Unsupported st_mode 020644 for /tmp/.criu.mntns.EbRInZ/mnt-0000002192/kmsg Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> | |||
Pavel Tikhomirov | 57732944284 | zdtm: fix race in pidns02 test plus small cleanup. Again we need to wait for the process to become a zombie, else we have a race between the zdtm file persistence check and the zombieing of this process. Also fix error handling and error messages. Reported by Jenkins: https://ci.openvz.org/job/CRIU/job/CRIU-virtuozzo-stable/job/vz7-u15/65/consoleFull ------------------------ ERROR OVER ------------------------ 31415: Old files lost: set([u'1', u'0', u'3', ... | PSBM-104930 | ||
Denis Silakov | 1e25e1c5263 | AUTO Version bump to 3.12.5.7 | |||
Alexander Mikhalitsyn | ef33f289571 | cr-dump: fix vpid corruption on pre-dump. We should check that vpid is empty on pstree_item in the pre_dump_one_task() function and fill it with appropriate data from parasite (see PARASITE_CMD_DUMP_MISC handle). But if vpid is already filled, it means that we already initialized the tasks' vpids with corresponding values with respect to nested pid namespaces and we shouldn't override these values. See also 2fb9d5d8e351c711dd7d60d230aef37c4084... | PSBM-104960 | ||
Denis Silakov | c1168159206 | AUTO Version bump to 3.12.5.6 | |||
Andrey Zhadchenko | c7db31315a6 | mount: adjust log level for get_clean_mnt. In case get_clean_mnt fails, open_mountpoint is still able to resolve mounts via a helper process or print an error in the worst case. Use pr_warn instead of pr_perror. https://jira.sw.ru/browse/PSBM-96506 Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com> | PSBM-96506 | ||
Andrey Zhadchenko | d6f30ecdc24 | mount: adjust log level for mnt_is_dir. mnt_is_dir is used when looking up a suitable mount point. In some cases that function may fail several times. Error level seems too strict for these cases. Added an error message to lookup_mnt_sdev in case all mnt_is_dir calls failed. As for open_handle and alloc_openable, which call mnt_is_dir, they are used in check_open_handle, which will report an error afterwards. Adjusted log level for __open_m... | PSBM-96506 | ||
Andrey Zhadchenko | c9223677441 | zdtm: add somaxconn test. Test if /proc/sys/net/core/somaxconn exists and if it is correctly restored. https://jira.sw.ru/browse/PSBM-94854 Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com> | PSBM-94854 |
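
The remounted_rw fix described in commit db93017cacd comes down to the difference between logical and bitwise AND in C. The following is a minimal sketch of that difference, not the CRIU code itself: the flag values, `EXAMPLE_OTHER_FLAG`, and all function names are assumptions made for illustration; only the name `REMOUNTED_RW` and the two expressions come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration only; the real
 * definitions live in the CRIU sources and may differ. */
#define REMOUNTED_RW       0x1
#define EXAMPLE_OTHER_FLAG 0x2 /* invented second flag, not from CRIU */

/* Buggy form: REMOUNTED_RW is a non-zero constant, so the logical AND
 * reduces to "x is non-zero", whatever bits are actually set. */
static bool buggy_check(int x)
{
	return x && REMOUNTED_RW;
}

/* Fixed form: bitwise AND tests only the REMOUNTED_RW bit. */
static bool fixed_check(int x)
{
	return (x & REMOUNTED_RW) != 0;
}

/* Small self-check showing where the two forms disagree. */
static void remounted_rw_selftest(void)
{
	/* Both agree when nothing is set or when the bit is set. */
	assert(!buggy_check(0) && !fixed_check(0));
	assert(buggy_check(REMOUNTED_RW) && fixed_check(REMOUNTED_RW));

	/* They disagree when only an unrelated bit is set: the buggy
	 * form would report the mount as remounted writable. */
	assert(buggy_check(EXAMPLE_OTHER_FLAG));
	assert(!fixed_check(EXAMPLE_OTHER_FLAG));
}
```

Calling `remounted_rw_selftest()` from a `main` function exercises the case in which the two forms disagree: a value with only an unrelated bit set. That is the kind of input for which the old check would have triggered an unnecessary readonly remount.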