OpenVZ-legacy

linux-2.6.27-openvz

Commits (author, commit, message)
Pavel Emelyanov
a8e6d74c128 OpenVZ kernel 2.6.27-kiprensky released
    Called after Orest Adamovich Kiprensky - a leading Russian portraitist in the Age of Romanticism
Konstantin Khlebnikov
873e9cd2630 quota: fix compilation 32-bit compat quota, remove size checks.
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov
1e94f945a7c x86: fix compilation for 32-bit kernel
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Pavel Emelyanov
6c1f64ffad9 Merged 2.6.27.45
    Conflicts:
        Makefile
        include/net/netfilter/ipv6/nf_conntrack_ipv6.h
        net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
        net/ipv6/netfilter/nf_conntrack_reasm.c
Konstantin Khlebnikov
491fd14dbf2 CPT: update image version to CPT_VERSION_27_3
    sync cpt minor version with rhel5 branch
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Konstantin Khlebnikov
75f08417b3a CPT: ignore deleted linked chr blk fifo nodes
    Ignore unlinked but referenced pipes, character and block device nodes. The restore process will create them itself.
    Bug #455855
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Pavel Emelianov
e0c4ee5359f CPT: Dump fake hardlinks on inotify watch's inodes
    When a watch is attached to an unlinked and closed file it will not be restored, since the inode will not be in the image. To fix this the proposal is to create a fake link on the inode in a temp dir and dump it.
    Bug #454944
Vitaliy Gusev
6ba6e951f80 CPT: Open hardlinked files only if is set 'hardlinked_on'
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Vitaliy Gusev
9d20138ff37 CPT: Add ioctl CPT_HARDLNK_ON for rst
    vzctl has to call ioctl CPT_HARDLNK_ON to enable opening of hardlinked files by the kernel during restore. This protection is needed to prevent mixing a new kernel with an old vzctl (which doesn't do cleaning). In other words, it prevents creating/opening files which will not be removed, an issue that could otherwise lead to a security problem.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Vitaliy Gusev
489c482ce5a CPT: Add CPT_DENTRY_HARDLINKED flag to cpt_file_image
    This flag tells that file was hardlinked.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Vitaliy Gusev
77c9b06810a CPT: Create hard links to "deleted but referenced" during checkpoint
    For "deleted but referenced" files, the kernel creates a hard link in the directory (that was set via CPT_LINKDIR_ADD) in the format:
        .cpt_hardlink.xxxxxxxx
    x - digit, from 0 to 9
    Note - this policy is used only when no other way of dumping the unlinked file helped.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
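The naming scheme described in 77c9b06810a can be illustrated with a small sketch. The commit only states the format (a fixed prefix followed by eight decimal digits); how the kernel actually picks the digits is not stated, so the sequence counter and the helper name `cpt_hardlink_name` below are assumptions made for illustration.

```c
#include <stdio.h>

/* Prefix and digit count come from the commit message. */
#define CPT_HARDLINK_PREFIX ".cpt_hardlink."

/*
 * Hypothetical helper: write ".cpt_hardlink." plus eight decimal digits
 * into buf. The use of a sequence number is an assumption; the commit
 * does not say how the digits are chosen.
 * Returns 0 on success, -1 if buf is too small.
 */
static int cpt_hardlink_name(char *buf, size_t len, unsigned int seq)
{
    /* Reduce seq modulo 10^8 so the suffix is always exactly eight digits. */
    int n = snprintf(buf, len, CPT_HARDLINK_PREFIX "%08u", seq % 100000000u);

    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

A caller would need a buffer of at least 23 bytes (14 for the prefix, 8 digits, and the terminating NUL).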
Vitaliy Gusev
a863dd33895 CPT: Add ioctl CPT_LINKDIR_ADD for cpt
    vzctl has to call ioctl CPT_LINKDIR_ADD to tell the kernel where to create hardlinked files during checkpoint. Without this ioctl the kernel assumes that creating hardlinked files is off.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Konstantin Khorenko
d470b275a58 CPT: stop the migration if shm restoration failed
    Bug #268163
    Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Marat Stanichenko
1667e6ee73e CPT: restart local_kernel_thread in case of -ERESTARTNOINTR
    This is essential in case of migration to an SLM node. We can bump into a situation where SLM refuses to fork during the undumping process because it thinks that the subgroup's resources are to be redistributed. When this happens fork is delayed with the -ERESTARTNOINTR error and the undumping process fails. As Den (den@) noticed, userspace is not intended to see the -ERESTARTNOINTR error so we should h...
Andrey Mirkin
66d2ccac941 CPT: save/restore only classic task flags
    Task flags were restored as they were saved in the image. That is not correct, as flags differ in 2.6.9, 2.6.16 and 2.6.18 kernels. Actually we just need to save/restore only classic flags (PF_EXITING, PF_DEAD, PF_FORKNOEXEC, PF_SUPERPRIV, PF_DUMPCORE and PF_SIGNALED). The problems can occur because during migration from 2.6.9 to 2.6.18 kernel flag PF_USED_MATH was not restored on tsk->flags ...
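The masking idea in 66d2ccac941 can be sketched as follows. This is not the patch itself: the helper names and `CPT_CLASSIC_FLAGS` are hypothetical, and the numeric flag values are assumptions based on 2.6.18-era headers that should be checked against the actual kernel tree.

```c
#include <stdint.h>

/* Assumed values (2.6.18-era include/linux/sched.h); verify before relying on them. */
#define PF_EXITING    0x00000004u
#define PF_DEAD       0x00000008u
#define PF_FORKNOEXEC 0x00000040u
#define PF_SUPERPRIV  0x00000100u
#define PF_DUMPCORE   0x00000200u
#define PF_SIGNALED   0x00000400u

/* Hypothetical name for the set of flags the commit calls "classic". */
#define CPT_CLASSIC_FLAGS \
    (PF_EXITING | PF_DEAD | PF_FORKNOEXEC | \
     PF_SUPERPRIV | PF_DUMPCORE | PF_SIGNALED)

/* Checkpoint side: keep only the classic flags in the image. */
static uint32_t cpt_flags_to_image(uint32_t task_flags)
{
    return task_flags & CPT_CLASSIC_FLAGS;
}

/*
 * Restore side: take the classic flags from the image and leave every
 * other bit as the restoring kernel set it on the new task.
 */
static uint32_t rst_flags_from_image(uint32_t current_flags, uint32_t image_flags)
{
    return (current_flags & ~CPT_CLASSIC_FLAGS) |
           (image_flags & CPT_CLASSIC_FLAGS);
}
```

Masking on both sides means an image written by a kernel with a different flag layout cannot overwrite bits that have another meaning on the destination kernel.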
Andrey Mirkin
1b9a4f0a254 CPT: udp sockets restore fix
    Some applications (like ntpd) set sk_reuse to 1 on udp sockets, so any other application can bind to the same port. During restore we must skip this check and restore and bind all sockets. On IPv6 we must also force the DAD (Duplicate Address Detection) procedure to be sure that the IFA_F_TENTATIVE flag will be cleared on the IPv6 address and the socket can be bound to it.
    http://bugzilla.openvz.org/show_bu...
Vitaliy Gusev
ca00b57a505 CPT: screw up udev bindmounts knot
    Ubuntu's udev on boot does:
        if ! mountpoint -q /dev; then
            # initramfs didn't mount /dev, so we'll need to do that
            mount -n --bind /dev /etc/udev
            mount -n -t tmpfs -o mode=0755 udev /dev
            mkdir -m 0700 -p /dev/.static/dev
            mount -n --move /etc/udev /dev/.static/dev
        fi
    So, workaround is dumping "/dev" as bindmount's sourc...
Vitaliy Gusev
d625e87fedc CPT: restore dead tasks proc files
    If some process opened /proc/<pid>/<somefile> and the process with <pid> dies after some time, then checkpoint fails with error:
        Can not dump VE: Invalid argument
        Error: d_path cannot be looked up /proc/125/cmdline
    The fix is to catch this situation at the dump time, mark the image respectively and restore a fake file on restore.
    http://bugzilla.openvz.org/show_bug.cgi?id=1047
    Sig...
Vitaliy Gusev
3d906db0bff CPT: adjust vfsmounts restore order
    Idea is: dump the parent before dumping its children.
    This order is needed during checkpoint/restore:
        mount /A /B -o bind
        mount none /C -t tmpfs
        mkdir /C/D
        mount /B /C/D --move
    After this, checkpoint (w/o this patch) will dump vfsmounts in order:
        - vfsmount, bind to /A, mounted to /C/D
        - vfsmount, mounted to /C (tmpfs)
    and will restore in the same order, that cause...
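The "parent before children" rule from 3d906db0bff can be sketched as a simple depth-first ordering. This is an illustration only, not the kernel code: the array-of-parent-indices representation, `MAX_MOUNTS`, and the function names are assumptions.

```c
#include <stddef.h>

#define MAX_MOUNTS 64 /* arbitrary bound for this sketch */

/* Emit mount i after its parent chain has been emitted. */
static void visit(const int *parent, int i, char *seen, int *out, size_t *k)
{
    if (seen[i])
        return;
    seen[i] = 1; /* mark first so a malformed parent loop cannot recurse forever */
    if (parent[i] >= 0)
        visit(parent, parent[i], seen, out, k);
    out[(*k)++] = i;
}

/*
 * parent[i] is the index of mount i's parent, or -1 for a root mount.
 * Fills out[] with mount indices so that every parent precedes its
 * children. Returns the number of entries written, or -1 on bad input.
 */
static int order_parent_first(const int *parent, int n, int *out)
{
    char seen[MAX_MOUNTS] = {0};
    size_t k = 0;

    if (n < 0 || n > MAX_MOUNTS)
        return -1;
    for (int i = 0; i < n; i++)
        if (parent[i] >= n)
            return -1;
    for (int i = 0; i < n; i++)
        visit(parent, i, seen, out, &k);
    return (int)k;
}
```

With the example from the commit message, taking mount 0 as the bind mount at /C/D and mount 1 as the tmpfs at /C, `parent = {1, -1}` yields the order 1, 0, so /C is handled before /C/D.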
Vitaliy Gusev
1742796186e CPT: dont cpt requiresdev fs
    Don't allow checkpointing a VE with mounted ext2/ext3, etc. filesystems. Allow checkpoint only for mounted nodev and "external" filesystems. This check protects from an error on restore:
        CPT ERR: ffff810007113000,102 :-2 mounting /root/some_dir ext3 40000000
    as do_one_mount() doesn't pass mntdev to mount().
    [xemul: actually, the reason we don't support filesystems other than virtual and tmpfs ...
Vitaliy Gusev
872b2f24d86 CPT: Restore information about tcp listening sockets
    Not all options are important. Only a missed ipv6only can cause an error if another application wants to listen on the same port for the IPv4 any address. tp->XXX are inherited by children (noticed by Alexey Kuznetsov), so we also need to restore these options.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Comment from Alexey: It [everything before] was not OK. The feature which are broken are important...
Vitaliy Gusev
92b2dc068f8 CPT: put 'expect' after insert to the 'conntrack'
    During conntrack restore, we need to put the expect after allocating ip_conntrack_expect and doing something with it. The expect will be freed either immediately (if nobody holds this expect) or during cleanup/timer hooks. Otherwise the expect will never be freed.
    Note: Approaches for kernels 2.6.18 and 2.6.9 are different. For example see help() in "net/ipv4/netfilter/ip_conntrack_netbios_ns.c"
    Signed-off-by: Vi...
Vitaliy Gusev
d3890ed905a CPT: Fix ip_conntrack_ftp usage counter leak
    Function ip_conntrack_helper_find_get() gets module counter. So put a conntrack after putting in the hash and handling the conntrack's expect list.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Konstantin Khlebnikov
07c486490c3 CPT: dump and restore global snmp statistics
    Per device exists for ipv6 only and is probably not used now, but anyway - I'll do it later.
    This patch adds new section CPT_SECT_SNMP_STATS that is populated with CPT_OBJ_BITS set of objects - one for each type of statistics. Objects have variable length. Stats are stored as a plain array of __u32 numbers and thus the order in which stats types are stored is implicitly hard-coded. In case we...
Vitaliy Gusev
571b94274a3 CPT: Fix memory corruption if cpt_family is wrong.
    During restore, if the parent socket is AF_INET but cpt_family is wrong (not initialized, see bug #95113), then treating the request as related to AF_INET6 is not right and leads to memory corruption. As there are a lot of buggy images, we can't check only on values AF_INET and AF_INET6.
    Decision:
        - Check request on AF_INET6 first, and consider request as AF_INET by default.
        - Additionally c...
Pavel Emelianov
2fd0b536d95 CPT: fix restoring of /dev/null opened early by init
    The problem is the following:
        * init from fc9 starts and opens /dev/null for its stdin, stdout and stderr
        * udev starts and overmounts /dev with tmpfs
    After this cpt cannot dump this ve, since one process holds a file, that is inaccessible from ve root.
    The proposed solution is the following:
        1. allow for /dev/null to be over-mounted
        2. restore init's file in two stages: stage1: *before*...
Pavel Emelianov
a118239231c CPT: lock sock before restoring its synwait queue
    This new socket already has all the necessary TCP timers armed, so tcp_keepalive_timer can fire during the rst_restore_synwait_queue and (for the latter being lockless) can spoil the queue.
    Bug #118912
Konstantin Khlebnikov
4a4d0f4ecd7 CPT: sysctl randomize_va_space
    implement checkpointing for virtualized sysctl kernel.randomize_va_space.
    reuse existing unused pad1 field in cpt_veinfo_image.
        0 -> image without rnd_va_space virtualization (default value is used)
        1 -> rnd = 0
        2 -> rnd = 1
        etc...
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
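The off-by-one encoding described in 4a4d0f4ecd7 can be sketched as below. The mapping (0 means "not in image", otherwise stored value minus one) comes from the commit message; the function names, the constant name, and the assumption that pad1 is a 32-bit field are illustrative.

```c
#include <stdint.h>

/* Hypothetical name: pad1 == 0 means the image predates this feature. */
#define CPT_RND_VA_ABSENT 0u

/* Checkpoint side: shift the sysctl value by one so that 0 stays reserved. */
static uint32_t cpt_encode_rnd_va(unsigned int rnd)
{
    return (uint32_t)rnd + 1u;
}

/*
 * Restore side: returns 1 and stores the sysctl value in *rnd if the
 * image carries one, or returns 0 (leaving *rnd untouched) if the
 * default value should be used.
 */
static int rst_decode_rnd_va(uint32_t pad1, unsigned int *rnd)
{
    if (pad1 == CPT_RND_VA_ABSENT)
        return 0;
    *rnd = pad1 - 1u;
    return 1;
}
```

Reserving 0 keeps old images readable: a zero-filled padding field in an older image decodes to "use the default" rather than to a real setting.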
Andrey Mirkin
ea388aec938 CPT: add check for presence of module slm_dmprst if SLM is enabled
    Add a check in "checks" for presence of module slm_dmprst if SLM is enabled. Check will be performed for both source and destination nodes. Changes in vzmigrate are not needed.
    Bug #114312
Andrey Mirkin
2310d0a2b0c CPT: add diagnostics in case of iptables-restore fail
    It is not clear right now what is wrong if iptables-restore fails. Add some diagnostics in case of error.
    Bug #95952
Denis V. Lunev
c45d8e0b2da CPT: Check that VE is not running on restore.
    Bug #99679
    Signed-off-by: Denis V. Lunev <den@parallels.com>
Andrey Mirkin
8b145eb8a8d CPT: fix check in decode_tuple()
    Tuple structure can be used as a mask and protonum can be 0xffff in 2.6.9 kernel. In 2.6.18 kernel all masks for protonum are 0xff and 0xffff will be shrunk to 0xff.
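A minimal sketch of the narrowing described in 8b145eb8a8d follows. The commit only says that a 0xffff mask from a 2.6.9 image becomes 0xff on 2.6.18; the helper name, the `is_mask` parameter, and the rejection of other out-of-range values are assumptions, not the actual decode_tuple() logic.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a 16-bit protonum field from an older
 * image into the 8-bit field used by the newer kernel.
 * An all-ones 16-bit mask is shrunk to an all-ones 8-bit mask; any
 * other value must already fit in 8 bits.
 * Returns 0 on success, -1 if the value cannot be represented.
 */
static int convert_protonum(uint16_t img, int is_mask, uint8_t *out)
{
    if (is_mask && img == 0xffff) {
        *out = 0xff;
        return 0;
    }
    if (img > 0xff)
        return -1;
    *out = (uint8_t)img;
    return 0;
}
```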
Andrey Mirkin
976e4d059fc CPT: fix restore of conntrack expect timer
    One more fix of restore conntrack procedure. Following code:
        if (ct->helper->timeout && !del_timer(&exp->timeout)) { ... }
    can lead to oops, as exp->timeout is not initialized at this point. Actually this optimization is not needed at all. If expectation is dying, then we will let it die by its own death. Also in ip_conntrack_expect_insert() there is an initialization of exp->timeout. And ...
Andrey Mirkin
62c7a4bab55 CPT: restore mark value on conntracks
    Restore mark value in conntracks as it is needed for connmark module.
Andrey Mirkin
7cf54c679c0 CPT: convert conntrack tuple from 2.6.9 kernel image
    Add conversion for conntrack tuple from 2.6.9 kernel image. Check for correct value is added in decode_tuple().
Andrey Mirkin
2ff9f3cc080 CPT: convert conntrack image from 2.6.9 to 2.6.18
    CPT structure in image file for conntracks is different in 2.6.9 and 2.6.18 kernels (array cpt_help_data was enlarged in the middle of the structure), so conntracks from 2.6.9 kernel are restored incorrectly on 2.6.18 kernel and lead to kernel oops. A simple conversion from 2.6.9 to 2.6.18 is introduced to restore conntracks correctly on 2.6.18 kernel.
    Bug #113290
Andrey Mirkin
91e78b8011e CPT: create kernel threads in VE0 context
    In the current implementation the master process which performs checkpointing has owner_env set to VE0 and exec_env set to VE. All auxiliary kernel threads are created with exec_env set to VE and owner_env set to VE0, so after the do_fork_pid() we have the following:
        * new thread has owner_env == ve0, exec env == ve
        * its pid belongs to ve (pid->veid != 0)
    That is why if ve_enter() in thread fails, ...
Andrey Mirkin
51b4fc81711 CPT: restore rlimits correctly during 32bit-64bit migration
    During 32bit to 64bit migration rlimits were restored incorrectly due to different size of long on 32bit and 64bit archs. Now simple conversion is introduced in case of 32bit-64bit migration. Infinity values are restored as infinity values. Error is returned if value greater than RLIM_INFINITY32 is found in dump during restore on 32bit arch.
    Bug #111965
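The conversion rule in 51b4fc81711 can be sketched like this. The behaviour (infinity maps to infinity, oversized values are rejected on 32-bit) follows the commit message; the function names, the concrete all-ones values of the two infinity constants, and the -1 error return are assumptions.

```c
#include <stdint.h>

/* Assumed: "infinity" is the all-ones value of the native long on each arch. */
#define RLIM_INFINITY32 0xffffffffULL
#define RLIM_INFINITY64 0xffffffffffffffffULL

/* Image written on a 32-bit node, restored on a 64-bit node. */
static uint64_t rlim_32_to_64(uint32_t v)
{
    /* Infinity must stay infinity rather than become a 4 GiB limit. */
    return v == RLIM_INFINITY32 ? RLIM_INFINITY64 : (uint64_t)v;
}

/*
 * Image written on a 64-bit node, restored on a 32-bit node.
 * Returns 0 on success, -1 if the limit does not fit in 32 bits.
 */
static int rlim_64_to_32(uint64_t v, uint32_t *out)
{
    if (v == RLIM_INFINITY64) {
        *out = (uint32_t)RLIM_INFINITY32;
        return 0;
    }
    if (v > RLIM_INFINITY32)
        return -1;
    *out = (uint32_t)v;
    return 0;
}
```

Without the explicit infinity check, a plain truncation or zero-extension would silently turn "unlimited" into a finite limit, or a large finite limit into a small one.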
Andrey Mirkin
a7128ebfcc9 CPT: restore packet control block from kernels with and without IPv6
    More generic mechanism for restoring packet control blocks. Unfortunately we do not save length of control block in dump and we can only try to calculate it during restore. This method is based on knowledge that the flags value in TCP control block is not zero for all packets in queue. Since this image version TCP control block will be saved in IPv6 form regardless to IPv6 config option. Restor...
Andrey Mirkin
a070ca88639 CPT: add binfmt_misc fs in supported list
    Just add binfmt_misc to the list of supported file systems. With this small quick fix migration will be allowed, but all binfmt_misc entries will be dropped during migration. This fix is only a stopgap. Later a generic mechanism for checkpointing/restore of external modules will be implemented, and this quick fix will be replaced with full support for binfmt_misc in CPT.
    Bugs #100709, #101061
Andrey Mirkin
7a3cd47ac5e CPT: relax check for several bind mounts on the same mount point
    Relax the check for special bind mounts which are mounted several times on the same mount point. We need to check only the dentry; the mount check can be skipped in this case. We can't completely remove the mount check, as there exist cases when we need to check mnt too. E.g. /dev is mounted with NODEV over /dev and some file is opened from the underlying mount. If mount check is removed, then we will be able to ch...
Andrey Mirkin
a44bf01a820 CPT: fix reopen dentries procedure
    Dentries were not reopened correctly during checkpointing and restore. Two bugs fixed:
        1. In case of huge files (more than 2Gb) dentry_open() returns -EFBIG if O_LARGEFILE flag is not set. This flag should be used for temporary files used during checkpointing and restore process.
           Bug #99544 https://bugzilla.sw.ru/show_bug.cgi?id=99544
        2. In dump_content_regular() we have following co...
Andrey Mirkin
f6239592e1b CPT: fix save/restore of open requests
    Open requests were sometimes saved and restored incorrectly:
        1. Family of open request was not saved (commented out)
        2. Restore was broken, would crash because rsk_ops was cleared by memset.
        3. And finally, all the code restoring open requests was skipped.
    Tested with http_load.
    Bug #95113 http://bugzilla.openvz.org/show_bug.cgi?id=784
Andrey Mirkin
b0cea3cc482 cpt: add lost dcache_lock protection around __d_path()
    Protect __d_path() call with dcache_lock spinlock. Protect other checks with env->op_sem semaphore.
    Bug #98833
Andrey Mirkin
3f97fd2e9d3 cpt: fix restore of inotify on symlink
    Inside VE file /etc/mtab is a symlink to /proc/mounts. FreeNX server with KDE creates inotify on /etc/mtab file. To restore such inotify we need to obtain dentry with path_lookup() and restore inotify on it.
    Bug #96464
Konstantin Khlebnikov
f40dd4a1c98 quota: compat layer for compat quota
    This patch implements compatibility quotactls for old quota tools.
    replace:
        diff-fs-quotcompat-ia32emul-fix-20050921
        diff-fs-quotcompat-comp-fix-20080710
        diff-fs-quotcompat-xencomp-fix-20080806
        diff-fs-quota-compat-proper-split-20081027
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Pavel Emelianov
8f0518c1e0e ve: Don't check for CAP_SETVEID - use more ... imagination
    This patch: The proposed check correctly detects the root in ve0. However, we lose the ability to create containers with some fancy tool, that has the CAP_SETVEID capability *only*, but we don't have such. The cap itself is declared to be obsoleted, but there's no need in rewriting vzctl in a rush - things will still work. If we'll want to manipulate audit caps from the ...
Pavel Emelianov
8e864dea614 fairsched: Sanitize fairsched manipulations on ve startup
    First of all, we won't be able to call them after we fix capability checks. Second, taking the fairsched mutex 4 times on startup is overkill.
Konstantin Khlebnikov
698944dc711 ms: lutime lchmod syscalls
    Add possibility to change owner/permissions on symbolic links
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Konstantin Khorenko
bd37c86c4b8 ve-net: permit changing of netdev's tx_queue_len from inside a CT
    In particular it makes OpenVPN happy.
    Bug #457318
    Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>