OpenVZ-legacy

linux-2.6.32-openvz

Author | Commit | Message | Commit date | Issues
Pavel Emelyanov
c05f95fcb04 OpenVZ kernel 2.6.32-avdeyev released
    Named after Sergei Vasilyevich Avdeyev - a Russian cosmonaut.
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Pavel Emelyanov
b4a419d9abd [merge] Merged linux-2.6.32.12
    Conflicts: Makefile
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
455792e7712 ipv6: fix sysctl unregistering order
    Call addrconf_ifdown for loopback with how=0 at the last ipv6 addr delete to fix sysctl table unregister ordering: all other interfaces attach their sysctl paths to lo's, so unregister lo sysctl tables only at namespace destroy.
    https://bugzilla.sw.ru/show_bug.cgi?id=473430
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
fa86dba2b62 ve: fix ve task state percpu counters
    Counter overflow detection for ve tasks in running/uninterruptible/iowait state was broken due to a type mismatch: nr_{running/unin..e/iowait}_ve() uses _long_ for summing _int_ percpu counters. As a result, ve loadavg calculation broke after the first int overflow. This patch expands all these percpu counters to unsigned long.
    http://bugzilla.openvz.org/show_bug.cgi?id=1396
    Signed-off-by: Konstant...
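The failure mode described in fa86dba2b62 can be reproduced outside the kernel. The sketch below is not the kernel code: it assumes a 64-bit machine, uses int32_t and uint64_t as stand-ins for the kernel's int and unsigned long, and the names sum_narrow and sum_wide are hypothetical. A task may be counted on one CPU and uncounted on another, so individual per-CPU counters drift and eventually wrap; the wraps cancel out only if the counters are as wide as the accumulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Broken shape: 32-bit per-CPU counters summed into a 64-bit accumulator.
 * Once a counter has wrapped, sign extension makes the total wrong. */
int64_t sum_narrow(const int32_t *percpu, size_t ncpus)
{
    int64_t sum = 0;

    assert(percpu != NULL);
    for (size_t i = 0; i < ncpus; i++)
        sum += percpu[i];
    return sum;
}

/* Fixed shape: counters and accumulator have the same width, so the
 * arithmetic is consistent modulo 2^64 and per-CPU wraps cancel out. */
uint64_t sum_wide(const uint64_t *percpu, size_t ncpus)
{
    uint64_t sum = 0;

    assert(percpu != NULL);
    for (size_t i = 0; i < ncpus; i++)
        sum += percpu[i];
    return sum;
}
```

For example, after 2^31 increments on CPU 0 and 2^31 - 1 decrements on CPU 1 the true count is 1. The wrapped 32-bit counters hold INT32_MIN and -INT32_MAX, which sum_narrow adds up to -4294967295, while the 64-bit counters sum to 1.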
Konstantin Khlebnikov (committer: Pavel Emelyanov)
b484e22d951 check flags on parsed structure
    http://bugzilla.openvz.org/show_bug.cgi?id=1464
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
d8a86ef5a6c CPT: check signal curr_target at restore
    Set signal curr_target to current if the right task was not found. Fixes an oops after a broken restore. "curr_target" controls round-robin signal target balancing over process threads, so there is no reason to care about migration accuracy.
    http://bugzilla.openvz.org/show_bug.cgi?id=1467
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Pavel Emelyanov
61845b781db cpt: Don't mind the tsk->splice_pipe cache at cpt time
    This field is just a cache for the sendfile system call. It can be dropped safely during migration - the first sendfile after restore will recreate it.
    http://bugzilla.openvz.org/show_bug.cgi?id=881
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Peter (committer: Pavel Emelyanov)
fcd86ff706b Fix /proc/kmsg permissions with capabilities active
    Whenever an application sets cap_sys_admin=ep, reading /proc/kmsg fails with EPERM. This patch makes /proc/kmsg readable on HN.
    http://bugzilla.openvz.org/show_bug.cgi?id=1360
    Signed-off-by: Peter Volkov <pva@gentoo.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
8c6af363b89 quota: fix compilation 32-bit compat quota, remove size checks.
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
26aeb82fc7e x86: fix compilation for 32-bit kernel
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
92875e3c49a CPT: update image version to CPT_VERSION_27_3
    Sync cpt minor version with rhel5 branch.
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
f7dd75ba9de CPT: ignore deleted linked chr blk fifo nodes
    Ignore unlinked but referenced pipes, character and block device nodes. The restore process will create them itself.
    Bug #455855
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Pavel Emelianov (committer: Pavel Emelyanov)
d7c68b19182 CPT: Dump fake hardlinks on inotify watch's inodes
    When a watch is attached to an unlinked and closed file it will not be restored, since the inode will not be in the image. To fix this, the proposal is to create a fake link to the inode in a temp dir and dump it.
    Bug #454944
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Vitaliy Gusev (committer: Pavel Emelyanov)
7cf74bdd35d CPT: Open hardlinked files only if is set 'hardlinked_on'
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Vitaliy Gusev (committer: Pavel Emelyanov)
52c2eb6da3f CPT: Add ioctl CPT_HARDLNK_ON for rst
    vzctl has to call ioctl CPT_HARDLNK_ON to let the kernel open hardlinked files during restore. This protection is needed to prevent mixing a new kernel with an old vzctl (which doesn't do the cleanup). In other words, it prevents creating/opening files which will never be removed, which could lead to a security problem.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Signed-off-by: Pavel Emelyano...
Vitaliy Gusev (committer: Pavel Emelyanov)
72dfa44429c CPT: Add CPT_DENTRY_HARDLINKED flag to cpt_file_image
    This flag indicates that the file was hardlinked.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Vitaliy Gusev (committer: Pavel Emelyanov)
80d2ce353aa CPT: Create hard links to "deleted but referenced" during checkpoint
    For "deleted but referenced" files, the kernel creates a hard link in the directory (that was set via CPT_LINKDIR_ADD) in the format:
        .cpt_hardlink.xxxxxxxx
        x - digit, from 0 to 9
    Note - this policy is used only when no other way of dumping the unlinked file helped.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
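The naming format given in 80d2ce353aa (a fixed prefix followed by eight decimal digits) can be illustrated with a small userspace sketch. How the kernel actually chooses the digits is not stated in the commit message; the sequence-number approach and the function names make_hardlink_name and is_hardlink_name are assumptions made for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define CPT_HARDLINK_PREFIX ".cpt_hardlink."
#define CPT_HARDLINK_DIGITS 8

/* Build ".cpt_hardlink.xxxxxxxx" from a sequence number (hypothetical
 * scheme). Returns the length snprintf would have written. */
int make_hardlink_name(char *buf, size_t len, unsigned int seq)
{
    assert(buf != NULL);
    return snprintf(buf, len, CPT_HARDLINK_PREFIX "%08u", seq % 100000000u);
}

/* Return 1 if name is the prefix followed by exactly eight digits. */
int is_hardlink_name(const char *name)
{
    size_t plen = strlen(CPT_HARDLINK_PREFIX);

    if (strncmp(name, CPT_HARDLINK_PREFIX, plen) != 0)
        return 0;
    name += plen;
    for (int i = 0; i < CPT_HARDLINK_DIGITS; i++)
        if (!isdigit((unsigned char)name[i]))
            return 0; /* also stops at an early terminating NUL */
    return name[CPT_HARDLINK_DIGITS] == '\0';
}
```

A checker like is_hardlink_name is the kind of test a userspace tool could use when cleaning up leftover links in the link directory, although the commit does not say how vzctl does this.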
Vitaliy Gusev (committer: Pavel Emelyanov)
c24ab545f53 CPT: Add ioctl CPT_LINKDIR_ADD for cpt
    vzctl has to call ioctl CPT_LINKDIR_ADD to tell the kernel where to create hardlinked files during checkpoint. Without this ioctl the kernel assumes that creating hardlinked files is off.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khorenko (committer: Pavel Emelyanov)
d4ef97ff644 CPT: stop the migration if shm restoration failed
    Bug #268163
    Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Marat Stanichenko (committer: Pavel Emelyanov)
089c01a6503 CPT: restart local_kernel_thread in case of -ERESTARTNOINTR
    This is essential in case of migration to an SLM node. We can run into a situation where SLM refuses to fork during the undumping process because it thinks that the subgroup's resources are to be redistributed. When this happens, fork is delayed with the -ERESTARTNOINTR error and the undumping process fails. As Den (den@) noticed, userspace is not intended to see the -ERESTARTNOINTR error so we should h...
Andrey Mirkin (committer: Pavel Emelyanov)
8551a850a45 CPT: save/restore only classic task flags
    Task flags were restored as they were saved in the image. That is not correct, as flags differ between the 2.6.9, 2.6.16 and 2.6.18 kernels. Actually we just need to save/restore only the classic flags (PF_EXITING, PF_DEAD, PF_FORKNOEXEC, PF_SUPERPRIV, PF_DUMPCORE and PF_SIGNALED). Problems can occur because during migration from 2.6.9 to 2.6.18 kernel flag PF_USED_MATH was not restored on tsk->flags ...
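The idea in 8551a850a45, keeping only a fixed set of flags when crossing kernel versions, amounts to masking. The sketch below is an illustration, not the patch itself: the bit values are illustrative (the commit's point is that layouts differ between kernels), and the names CLASSIC_FLAGS, save_task_flags and restore_task_flags are hypothetical.

```c
#include <assert.h>

/* Illustrative bit values only; real values depend on the kernel version. */
#define PF_EXITING    0x00000004UL
#define PF_DEAD       0x00000008UL
#define PF_FORKNOEXEC 0x00000040UL
#define PF_SUPERPRIV  0x00000100UL
#define PF_DUMPCORE   0x00000200UL
#define PF_SIGNALED   0x00000400UL

#define CLASSIC_FLAGS (PF_EXITING | PF_DEAD | PF_FORKNOEXEC | \
                       PF_SUPERPRIV | PF_DUMPCORE | PF_SIGNALED)

/* Checkpoint side: store only the classic flags in the image. */
unsigned long save_task_flags(unsigned long tsk_flags)
{
    return tsk_flags & CLASSIC_FLAGS;
}

/* Restore side: take classic bits from the image and keep every other
 * bit as the restoring kernel set it for the new task. */
unsigned long restore_task_flags(unsigned long current_flags,
                                 unsigned long image_flags)
{
    unsigned long result = (current_flags & ~CLASSIC_FLAGS) |
                           (image_flags & CLASSIC_FLAGS);

    /* Non-classic bits must be untouched by the image. */
    assert((result & ~CLASSIC_FLAGS) == (current_flags & ~CLASSIC_FLAGS));
    return result;
}
```

Masking on both sides means an image written by an older kernel that still contains other bits is handled the same way as a new one.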
Andrey Mirkin (committer: Pavel Emelyanov)
75f2abfa9f9 CPT: udp sockets restore fix
    Some applications (like ntpd) set sk_reuse to 1 on udp sockets, so any other application can bind to the same port. During restore we must skip this check and restore and bind all sockets. On IPv6 we must also force the DAD (Duplicate Address Detection) procedure to be sure that the IFA_F_TENTATIVE flag will be cleared on the IPv6 address and the socket can be bound to it.
    http://bugzilla.openvz.org/show_bu...
Vitaliy Gusev (committer: Pavel Emelyanov)
ba94d3fa2bb CPT: screw up udev bindmounts knot
    Ubuntu's udev on boot does:
        if ! mountpoint -q /dev; then
            # initramfs didn't mount /dev, so we'll need to do that
            mount -n --bind /dev /etc/udev
            mount -n -t tmpfs -o mode=0755 udev /dev
            mkdir -m 0700 -p /dev/.static/dev
            mount -n --move /etc/udev /dev/.static/dev
        fi
    So, workaround is dumping "/dev" as bindmount's sourc...
Vitaliy Gusev (committer: Pavel Emelyanov)
faa9a6dd94c CPT: restore dead tasks proc files
    If some process opened /proc/<pid>/<somefile> and the process with <pid> dies some time later, then checkpoint fails with the error:
        Can not dump VE: Invalid argument
        Error: d_path cannot be looked up /proc/125/cmdline
    The fix is to catch this situation at dump time, mark the image accordingly and restore a fake file on restore.
    http://bugzilla.openvz.org/show_bug.cgi?id=1047
    Sig...
Vitaliy Gusev (committer: Pavel Emelyanov)
977418edcea CPT: adjust vfsmounts restore order
    The idea is: dump the parent before dumping its children. This order is needed during checkpoint/restore:
        mount /A /B -o bind
        mount none /C -t tmpfs
        mkdir /C/D
        mount /B /C/D --move
    After this, checkpoint (w/o this patch) will dump vfsmounts in order:
        - vfsmount, bind to /A, mounted to /C/D
        - vfsmount, mounted to /C (tmpfs)
    and will restore in the same order, that cause...
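The "parent before children" rule from 977418edcea can be sketched as a small ordering pass over a mount list. This is a userspace illustration under stated assumptions, not the kernel code: struct mnt, MAX_MNT and dump_order are hypothetical, each mount records the index of its parent (-1 for the root), and the parent links are assumed to be acyclic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MNT 64

struct mnt {
    int parent; /* index of the parent mount, -1 for the root */
};

/* Emit mount i after making sure its parent has been emitted first. */
static void visit(const struct mnt *m, size_t i, int *done,
                  size_t *order, size_t *pos)
{
    if (done[i])
        return;
    if (m[i].parent >= 0)
        visit(m, (size_t)m[i].parent, done, order, pos);
    done[i] = 1;
    order[(*pos)++] = i;
}

/* Fill order[0..n-1] so that every mount appears after its parent. */
void dump_order(const struct mnt *m, size_t n, size_t *order)
{
    int done[MAX_MNT] = {0};
    size_t pos = 0;

    assert(n <= MAX_MNT);
    for (size_t i = 0; i < n; i++)
        visit(m, i, done, order, &pos);
}
```

With the commit's example listed as root (index 0), the bind mount at /C/D (index 1, parent 2) and the tmpfs at /C (index 2, parent 0), the resulting order is 0, 2, 1, so /C exists before anything is mounted on /C/D.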
Vitaliy Gusev (committer: Pavel Emelyanov)
c42b985195c CPT: dont cpt requiresdev fs
    Don't allow checkpointing a VE with mounted ext2/ext3, etc. filesystems. Allow checkpoint only for mounted nodev and "external" filesystems. This check protects from an error on restore:
        CPT ERR: ffff810007113000,102 :-2 mounting /root/some_dir ext3 40000000
    as do_one_mount() doesn't pass mntdev to mount().
    [xemul: actually, the reason we don't support filesystems other than virtual and tmpfs ...
Vitaliy Gusev (committer: Pavel Emelyanov)
a1d028ce2f1 CPT: Restore information about tcp listening sockets
    Not all options are important. Only a missing ipv6only can cause an error if another application wants to listen on the same port for the IPv4 any address. tp->XXX are inherited by children (noticed by Alexey Kuznetsov), so we also need to restore these options.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Comment from Alexey: It [everything before] was not OK. The feature which are broken are important...
Vitaliy Gusev (committer: Pavel Emelyanov)
6364b5498e4 CPT: put 'expect' after insert to the 'conntrack'
    When restoring conntrack, we need to put the expect after allocating ip_conntrack_expect and doing something with it. The expect will be freed either immediately (if nobody holds it) or during cleanup/timer hooks. Otherwise the expect will never be freed.
    Note: Approaches for kernels 2.6.18 and 2.6.9 are different. For example see help() in "net/ipv4/netfilter/ip_conntrack_netbios_ns.c"
    Signed-off-by: Vi...
Vitaliy Gusev (committer: Pavel Emelyanov)
b3d4348ca63 CPT: Fix ip_conntrack_ftp usage counter leak
    Function ip_conntrack_helper_find_get() takes a reference on the module counter. So put the conntrack after inserting it into the hash and handling the conntrack's expect list.
    Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
74e373eeb5e CPT: dump and restore global snmp statistics
    Per-device statistics exist for ipv6 only and are probably not used now, but anyway - I'll do it later. This patch adds a new section CPT_SECT_SNMP_STATS that is populated with a set of CPT_OBJ_BITS objects - one for each type of statistics. Objects have variable length. Stats are stored as a plain array of __u32 numbers and thus the order in which stats types are stored is implicitly hard-coded. In case we...
Vitaliy Gusev (committer: Pavel Emelyanov)
3b0f4b2e050 CPT: Fix memory corruption if cpt_family is wrong.
    During restore, if the parent socket is AF_INET but cpt_family is wrong (uninitialized, see bug #95113), then treating the request as AF_INET6 is wrong and leads to memory corruption. Since there are a lot of buggy images, we can't simply check for the values AF_INET and AF_INET6.
    Decision:
        - Check request on AF_INET6 first, and consider request as AF_INET by default.
        - Additionally c...
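The first point of the decision in 3b0f4b2e050 (test for AF_INET6 explicitly and fall back to AF_INET for everything else) can be written as a one-line normalisation. The function name effective_family is hypothetical, and the second, truncated point of the commit message is not covered by this sketch.

```c
#include <assert.h>
#include <sys/socket.h>

/* Normalise a family value read from an image: only an explicit AF_INET6
 * is treated as IPv6; anything else, including garbage from an
 * uninitialised field, is treated as AF_INET. */
int effective_family(int cpt_family)
{
    int family = (cpt_family == AF_INET6) ? AF_INET6 : AF_INET;

    assert(family == AF_INET || family == AF_INET6);
    return family;
}
```

Defaulting to AF_INET rather than rejecting unknown values matches the commit's remark that many existing images carry a wrong cpt_family and still have to be restorable.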
Pavel Emelianov (committer: Pavel Emelyanov)
4a7ddd3db9a CPT: fix restoring of /dev/null opened early by init
    The problem is the following:
        * init from fc9 starts and opens /dev/null for its stdin, stdout and stderr
        * udev starts and overmounts /dev with tmpfs
    After this cpt cannot dump this ve, since one process holds a file that is inaccessible from ve root.
    The proposed solution is the following:
        1. allow for /dev/null to be over-mounted
        2. restore init's file in two stages: stage1: *before*...
Pavel Emelianov (committer: Pavel Emelyanov)
937a5462e54 CPT: lock sock before restoring its synwait queue
    This new socket already has all the necessary TCP timers armed, so tcp_keepalive_timer can fire during the rst_restore_synwait_queue and (for the latter being lockless) can spoil the queue.
    Bug #118912
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Konstantin Khlebnikov (committer: Pavel Emelyanov)
c5d30bd0194 CPT: sysctl randomize_va_space
    Implement checkpointing for virtualized sysctl kernel.randomize_va_space. Reuse existing unused pad1 field in cpt_veinfo_image.
        0 -> image without rnd_va_space virtualization (default value is used)
        1 -> rnd = 0
        2 -> rnd = 1
        etc...
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
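The encoding listed in c5d30bd0194 stores the sysctl value plus one, so that zero (what older images have in the unused field) can mean "not present". A minimal sketch of that mapping follows; the 32-bit field width and the function names encode_rnd_va_space and decode_rnd_va_space are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a randomize_va_space value for the image: value + 1, so that 0
 * stays reserved for images written without this virtualization. */
uint32_t encode_rnd_va_space(int rnd)
{
    assert(rnd >= 0);
    return (uint32_t)rnd + 1;
}

/* Decode the image field; 0 means the image carries no value and the
 * caller's default applies. */
int decode_rnd_va_space(uint32_t pad1, int default_rnd)
{
    return pad1 == 0 ? default_rnd : (int)(pad1 - 1);
}
```

Reserving zero this way keeps old images readable without bumping the image format, since a field that was previously always zero decodes to "use the default".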
Andrey Mirkin (committer: Pavel Emelyanov)
bbdcbaadf79 CPT: add check for presence of module slm_dmprst if SLM is enabled
    Add a check in "checks" for presence of module slm_dmprst if SLM is enabled. Check will be performed for both source and destination nodes. Changes in vzmigrate are not needed.
    Bug #114312
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Andrey Mirkin (committer: Pavel Emelyanov)
04c139f6c20 CPT: add diagnostics in case of iptables-restore fail
    Right now it is not clear what went wrong when iptables-restore fails. Add some diagnostics in case of error.
    Bug #95952
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Den Lunev (committer: Pavel Emelyanov)
f06677625bf CPT: Check that VE is not running on restore.
    Bug #99679
    Signed-off-by: Denis V. Lunev <den@parallels.com>
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Andrey Mirkin (committer: Pavel Emelyanov)
dcda9404300 CPT: fix check in decode_tuple()
    The tuple structure can be used as a mask, and protonum can be 0xffff in the 2.6.9 kernel. In the 2.6.18 kernel all masks for protonum are 0xff, and 0xffff will be shrunk to 0xff.
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
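The narrowing described in dcda9404300 can be sketched as a conversion from a 16-bit image field to an 8-bit one. This is a guess at the shape of the check, not the code of decode_tuple(): the function name convert_protonum, the -1 error return and the rejection of other out-of-range values are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a 16-bit protonum from an older image to the 8-bit form.
 * The all-ones mask 0xffff is shrunk to 0xff; other values above 0xff
 * are rejected (assumed behaviour). Returns 0 on success, -1 on error. */
int convert_protonum(uint16_t old, uint8_t *out)
{
    assert(out != NULL);
    if (old == 0xffff) {
        *out = 0xff;
        return 0;
    }
    if (old > 0xff)
        return -1;
    *out = (uint8_t)old;
    return 0;
}
```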
Andrey Mirkin (committer: Pavel Emelyanov)
5a889e32263 CPT: fix restore of conntrack expect timer
    One more fix for the conntrack restore procedure. The following code:
        if (ct->helper->timeout && !del_timer(&exp->timeout)) { ... }
    can lead to an oops, as exp->timeout is not initialized at this point. Actually this optimization is not needed at all. If the expectation is dying, then we will let it die its own death. Also in ip_conntrack_expect_insert() there is an initialization of exp->timeout. And ...
Andrey Mirkin (committer: Pavel Emelyanov)
19dce010faf CPT: restore mark value on conntracks
    Restore mark value in conntracks as it is needed for connmark module.
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Andrey Mirkin (committer: Pavel Emelyanov)
7ec63fdedf3 CPT: convert conntrack tuple from 2.6.9 kernel image
    Add conversion for conntrack tuple from 2.6.9 kernel image. Check for correct value is added in decode_tuple().
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Andrey Mirkin (committer: Pavel Emelyanov)
c34d6367f6c CPT: convert conntrack image from 2.6.9 to 2.6.18
    The CPT structure for conntracks in the image file differs between the 2.6.9 and 2.6.18 kernels (array cpt_help_data was enlarged in the middle of the structure), so conntracks from a 2.6.9 kernel are restored incorrectly on a 2.6.18 kernel and lead to a kernel oops. A simple conversion from 2.6.9 to 2.6.18 is introduced to restore conntracks correctly on a 2.6.18 kernel.
    Bug #113290
    Signed-off-by: Pavel Emelya...
Andrey Mirkin (committer: Pavel Emelyanov)
21644501b46 CPT: create kernel threads in VE0 context
    In the current implementation the master process which performs checkpointing has owner_env set to VE0 and exec_env set to VE. All auxiliary kernel threads are created with exec_env set to VE and owner_env set to VE0, so after the do_fork_pid() we have the following:
        * new thread has owner_env == ve0, exec env == ve
        * its pid belongs to ve (pid->veid != 0)
    That is why if ve_enter() in thread fails, ...
Andrey Mirkin (committer: Pavel Emelyanov)
686bb3916a1 CPT: restore rlimits correctly during 32bit-64bit migration
    During 32bit to 64bit migration rlimits were restored incorrectly due to the different size of long on 32bit and 64bit archs. Now a simple conversion is introduced in case of 32bit-64bit migration. Infinity values are restored as infinity values. An error is returned if a value greater than RLIM_INFINITY32 is found in the dump during restore on a 32bit arch.
    Bug #111965
    Signed-off-by: Pavel Emelyanov <xemul@o...
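The conversion rules stated in 686bb3916a1 can be sketched as follows. RLIM_INFINITY32 is named in the commit; its value here (all ones in 32 bits), RLIM_INFINITY64, the function name convert_rlim and the -1 error return are assumptions for illustration, not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RLIM_INFINITY32 0xffffffffULL          /* assumed: ~0UL on 32bit */
#define RLIM_INFINITY64 0xffffffffffffffffULL  /* assumed: ~0UL on 64bit */

/* Convert one rlimit value from the image to the restoring kernel's
 * width. Infinity maps to infinity; a finite value that does not fit a
 * 32-bit target is an error. Returns 0 on success, -1 on error. */
int convert_rlim(uint64_t img, int img_is_32bit, int dst_is_32bit,
                 uint64_t *out)
{
    uint64_t img_inf = img_is_32bit ? RLIM_INFINITY32 : RLIM_INFINITY64;
    uint64_t dst_inf = dst_is_32bit ? RLIM_INFINITY32 : RLIM_INFINITY64;

    assert(out != NULL);
    if (img == img_inf) {
        *out = dst_inf;
        return 0;
    }
    if (dst_is_32bit && img > RLIM_INFINITY32)
        return -1;
    *out = img;
    return 0;
}
```

Checking for infinity before the range test matters: a 64-bit infinity is larger than RLIM_INFINITY32 and would otherwise be rejected instead of being restored as infinity.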
Andrey Mirkin (committer: Pavel Emelyanov)
c3e4a29b420 CPT: restore packet control block from kernels with and without IPv6
    A more generic mechanism for restoring packet control blocks. Unfortunately we do not save the length of the control block in the dump and can only try to calculate it during restore. This method relies on the fact that the flags value in the TCP control block is non-zero for all packets in the queue. Starting with this image version, the TCP control block will be saved in IPv6 form regardless of the IPv6 config option. Restor...
Andrey Mirkin (committer: Pavel Emelyanov)
1f218bb8d60 CPT: add binfmt_misc fs in supported list
    Just add binfmt_misc to the list of supported file systems. With this small quick fix migration will be allowed, but all binfmt_misc entries will be dropped during migration. This fix is only a stopgap. Later a generic mechanism for checkpointing/restoring external modules will be implemented, and this quick fix will be replaced with full support for binfmt_misc in CPT.
    Bugs #100709, #101...
Andrey Mirkin (committer: Pavel Emelyanov)
85da0ddab18 CPT: relax check for several bind mounts on the same mount point
    Relax the check for special bind mounts which are mounted several times on the same mount point. We need to check only the dentry; the mount check can be skipped in this case. We can't remove the mount check completely, as there exist cases when we need to check mnt too. E.g. /dev is mounted with NODEV over /dev and some file is opened from the underlying mount. If mount check is removed, then we will be able to ch...
Andrey Mirkin (committer: Pavel Emelyanov)
bc4769bb4ac CPT: fix reopen dentries procedure
    Dentries were not reopened correctly during checkpointing and restore. Two bugs fixed:
        1. In case of huge files (more than 2 GB) dentry_open() returns -EFBIG if the O_LARGEFILE flag is not set. This flag should be used for temporary files used during the checkpointing and restore process. Bug #99544 https://bugzilla.sw.ru/show_bug.cgi?id=99544
        2. In dump_content_regular() we have following co...
Andrey Mirkin (committer: Pavel Emelyanov)
08b8f8ba476 CPT: fix save/restore of open requests
    Open requests were sometimes saved and restored incorrectly:
        1. Family of open request was not saved (commented out)
        2. Restore was broken, would crash because rsk_ops was cleared by memset.
        3. And finally, all the code restoring open requests was skipped.
    Tested with http_load.
    Bug #95113
    http://bugzilla.openvz.org/show_bug.cgi?id=784
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Andrey Mirkin (committer: Pavel Emelyanov)
0a6789976c6 cpt: add lost dcache_lock protection around __d_path()
    Protect __d_path() call with dcache_lock spinlock. Protect other checks with env->op_sem semaphore.
    Bug #98833
    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>