OpenVZ-legacy

linux-2.6.20-openvz

Author | Commit | Message | Commit date | Issues
OpenVZ teamPavelOpenVZ team
7fb6904faa4 linux-2.6.20-ovz005 released
Alexandr AndreevPavel EmelianovAlexandr Andreev
4e6b27576b0 [SCHED] Reduce starvation of some VCPUs in case of cpu limits. Change the logic of choosing best_vcpu to schedule to. There are two potential problems: a) if a vcpu is hot and the last used physical CPU of this vcpu is equal to smp_processor_id(), it will always be chosen. This is not a good decision, because there is no guarantee that _all_ physical CPUs must take vcpus from a vsched. For example, if cpulimit for a vsched is small, this vsched can be run only...
Alexandr AndreevPavel EmelianovAlexandr Andreev
60d4fc03c08 [SCHED] find_idle_vcpu() mask check fix. In find_idle_vcpu() we skip VCPUs whose IDs are not set in the physical '*cpus' mask. This is incorrect. We must skip VCPUs that have the appropriate VCPU->last_pcpu
Alexey KuznetsovPavel EmelianovAlexey Kuznetsov
abe1582e6b9 [CPT] checkpointing robust lists. Otherwise we are going to have problems with migration of newer glibcs using robust lists when this is possible.
Alexey KuznetsovPavel EmelianovAlexey Kuznetsov
b676b557c90 [CPT] alternative way to migrate zombie processes. In older 2.6.8 kernels do_exit() was very simple: essentially it disposed m etc, which is done automatically while checkpointing, and did some work on notifying the parent. So it was natural to move the restored process to zombie state by hand. In 2.6.18 do_exit() does _lots_ of work. It seems easier to invert the logic. We introduce a new flag PF_RESTART_EXIT, which suppresses the work which was ...
Pavel EmelianovPavel Emelianov
78d95858492 [CPT] Fix lockdep warning on socket dump. CPT locks all the sockets it finds for dumping. This is OK, but lockdep treats it as circular locking. It happens each time we migrate a VE with more than one socket aboard.
Vasily TarasovPavel EmelianovVasily Tarasov
0fb7b42a294 [PATCH] kconfig: security depends on !ve. Many people have CONFIG_SECURITY enabled in their configs. When they try to do `make oldconfig` for OpenVZ kernels with such configs, no questions appear concerning CONFIG_VE and friends, and people have OpenVZ kernels with virtualization features disabled. Fix it. Reverse the dependency of VE/SECURITY.
Pavel EmelianovPavel Emelianov
ebba4d4778e [IOACCT] Fix ioacct races. When a page becomes dirty there's no time to store a context on it: the page may become clean immediately. Thus we had a race in accounting when a page became clean before we set a context on it, and this context got lost and was not freed. Handle the context the other way: in case we're going to set a new context on a page that already has one, free it and account written bytes in case the page beca...
Pavel EmelianovPavel Emelianov
26fd0654761 [IOACCT] Debug on page release. When releasing an IO beancounter from the page that is not supposed to IO pb print a warning.
Alexey DobriyanPavel EmelianovAlexey Dobriyan
39f3e4faa35 [BC] UB put on error path in fork(). If fork() fails after ub_task_charge(), nobody puts the three beancounters taken there.
Alexey DobriyanPavel EmelianovAlexey Dobriyan
26d2416d9d2 [BC] uncharge fs root (/) from dcachesize. "/" dentry was charged in d_alloc_root(), then charged and uncharged during filesystem activity. But at umount time that first charge was forgotten. So uncharge "/" by hand.
Pavel EmelianovPavel Emelianov
ad4d59a0c82 [BC] Percpu counters discrepancy on 64bit arches. The operation long += -(unsigned int); leads to a wrong result on 64bit because there is no sign extension.
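The arithmetic pitfall named in this message can be reproduced in plain C. The following is a minimal sketch, not the code from the patch: the function names `sub_buggy` and `sub_fixed` are hypothetical, and widening to `long` before negating is only one possible way to get the sign extension back.

```c
/*
 * Sketch of "long += -(unsigned int)" on a 64-bit (LP64) target.
 * Function names are hypothetical, not taken from the OpenVZ patch.
 */

/* Buggy pattern: -delta is computed as an unsigned int, so it wraps to a
 * large positive value and is then zero-extended when widened to long. */
static long sub_buggy(long counter, unsigned int delta)
{
	counter += -delta;
	return counter;
}

/* One possible fix: widen to long first, then negate, so the negative
 * value is represented in the full width of long. */
static long sub_fixed(long counter, unsigned int delta)
{
	counter += -(long)delta;
	return counter;
}
```

With a 64-bit `long`, `sub_buggy(10, 5)` adds 4294967291 instead of subtracting 5, while `sub_fixed(10, 5)` yields 5. With a 32-bit `long` both variants wrap to the same result, which is why the discrepancy only shows up on 64-bit arches.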
Konstantin KhorenkoPavel EmelianovKonstantin Khorenko
1a98ff3dcd5 Fix a livelock in stop machine. A possible situation in stop_machine: - stopmachine_state == STOPMACHINE_WAIT; - STOPPER (stop_machine()) is in state SM_STOPPER_WAITING, calling yield() in a loop; - SLAVES (stopmachine()) also call yield() in a loop. This leads to the fairsched_lock suffering on all CPUs and in case of unfair getting lock rules (for example on NUMA node), some CPUs can wait for the lock forever/for a long ...
Evgeny KravtsunovPavel EmelianovEvgeny Kravtsunov
a0771080e6f [BRIDGE] Unaligned access on IA64 when compare ether addr. Patch fixes an unaligned access that takes place on ia64 in compare_ether_addr(). compare_ether_addr() requires addresses to be aligned on a 2-byte boundary, while addresses declared in bridges are aligned on 1-byte.
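To illustrate the alignment constraint, here is a small sketch of a MAC address comparison that makes no alignment assumption. It is not what the patch does (the message does not say whether the fix aligns the bridge addresses or changes the comparison); `ether_addr_differs` is a hypothetical helper, and `ETH_ALEN` is redefined locally so the snippet builds outside the kernel.

```c
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet address in bytes */

/*
 * Hypothetical helper: byte-wise comparison that is safe for addresses
 * aligned on any boundary. Like compare_ether_addr(), it returns 0 when
 * the addresses are equal and non-zero otherwise.
 */
static int ether_addr_differs(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, ETH_ALEN) != 0;
}
```

A comparison that reads the address as 16-bit words is faster but traps or is emulated on strict-alignment architectures such as ia64 when the pointer is odd, which is the situation the commit message describes.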
Vasily TarasovPavel EmelianovVasily Tarasov
255968d0558 [IOPRIO] cleaning active beancounter. After a beancounter disappears, it can still be active. Clean it up.
Denis V. LunevPavel EmelianovDenis V. Lunev
d69c518c043 [VENET] stop IP management before freeing venet. The device is freed before the VE<->IP mapping is cleaned.
Pavel EmelianovPavel Emelianov
57a997d6a3b [NETFILTER] Allow iptables work in 32bit VEs on 64bit machines. A silly mistake caused all VE's requests to receive -EPERM.
Pavel EmelianovPavel Emelianov
3f064957a63 Send signals to groups using global pids, not virtual. The problem is that __kill_pgrp_info() is called sometimes with global pid, sometimes with local one. Since we do not have arithmetic split of pids in 2.6.20 all the callers must pass the pids of one type. This was wrong.
Alexandr AndreevPavel EmelianovAlexandr Andreev
0f2fad545e1 [SCHED] VCPU should be initialized completely before deletion. There is a race in vsched_del_vcpu(): we can kill migration_thread() even if it has not started yet, i.e. the migration_thread() function is not called at all. So migrate_live_tasks() and migrate_dead_tasks() will not be called on this vcpu while the migration thread is killed. But there can be some tasks that have already migrated on thi... This bug can be easily reproduced. On a busy host with many running tasks user can run: In this case, after the second vzctl, the migration thread on VCPU 2 will be created and just woken up, but it may not have really started (been scheduled) yet if there are a lot of other higher-priority tasks running on the host. If it is not scheduled before the third vzctl call, there will be kernel bug in vsche...
Alexandr AndreevPavel EmelianovAlexandr Andreev
9b758cd113c [SCHED] find_busiest_group() should use pcpu mask. VCPUs should be skipped according to pcpu mask
Alexandr AndreevPavel EmelianovAlexandr Andreev
535be4f594d [SCHED] Fix for cpu_of(). In the new scheme, i.e. when the physical cpu mask is used whenever possible (in find_busiest_vsched(), find_busiest_queue() and so on), cpu_of() must also return the physical cpu id for a given vcpu. We have to use virtual ids (vcpu->id) only for vsched maps and for the process cpus allowed mask. In all other cases we need to use physical masks to account for the physical CPU topology.
Alexandr AndreevPavel EmelianovAlexandr Andreev
de939d37e50 [SCHED] Cleanup: use vcpu_last_pcpu macro instead of vcpu->last_pcpu. Replace vcpu->last_pcpu by vcpu_last_pcpu(vcpu), to fix compilation without CONFIG_VSCHED_VCPU
Alexandr AndreevPavel EmelianovAlexandr Andreev
93db2629f01 [SCHED] small cleanup of code. Remove unnecessary argument this_pcpu (=== smp_processor_id()) from find_idle_target() and find_busiest_vsched()
Alexandr AndreevPavel EmelianovAlexandr Andreev
237b70cd323 [SCHED] Improve vcpu scheduling taking into account cache hotness. In the original OVZ kernel schedule_vcpu() takes the next VCPU from the vsched->active list, and it doesn't take vcpu->last_pcpu into account, so VCPUs can jump from PCPU to PCPU too often. Try to skip 'hot' VCPUs, i.e. VCPUs that were running on some other PCPU recently. The time slice threshold is tunable via /proc/sys/kernel/vcpu_hot_timeslice
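The "hot" test described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the scheduler code from the patch: `struct vcpu_sketch`, its `stop_ts` field and the `vcpu_is_hot()` helper are hypothetical; only `last_pcpu` and the idea of a tunable threshold come from the message.

```c
/*
 * Hypothetical model of a VCPU for illustration only.
 * last_pcpu is named in the commit message; stop_ts is an assumption.
 */
struct vcpu_sketch {
	int last_pcpu;			/* physical CPU this VCPU last ran on */
	unsigned long long stop_ts;	/* time at which it last stopped running */
};

/*
 * A VCPU is considered "hot" for this_pcpu when it ran recently on some
 * OTHER physical CPU, i.e. its cache contents probably still live there.
 * hot_timeslice plays the role of the vcpu_hot_timeslice tunable.
 */
static int vcpu_is_hot(const struct vcpu_sketch *v, int this_pcpu,
		       unsigned long long now,
		       unsigned long long hot_timeslice)
{
	if (v->last_pcpu == this_pcpu)
		return 0;	/* same CPU: no cache penalty for running here */
	return now - v->stop_ts < hot_timeslice;
}
```

A scheduler loop built on this would prefer candidates for which `vcpu_is_hot()` returns 0 and fall back to a hot one only if nothing else is runnable, so that VCPUs stop bouncing between physical CPUs.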
Alexandr AndreevPavel EmelianovAlexandr Andreev
337726a8edb [SCHED] find_busiest_queue() should select VCPUs from given vsched only. In the new scheme, we choose the vsched in find_busiest_vsched(), i.e. before find_busiest_queue(), so when we look for the busiest queue we must consider this vsched's VCPUs only.
Alexandr AndreevPavel EmelianovAlexandr Andreev
acc98e53058 [SCHED] remove debug hunk from previous balance patch. My previous patch for load_balance() contains a wrong condition statement that I forgot to remove after debugging. load_balance() will not pull tasks from the busiest VCPUs if there are < 2 tasks running on the current VCPU. The attached patch removes this incorrect check and fixes the problem.
Kirill KorotaevPavel EmelianovKirill Korotaev
4c3dfbf42ff [PATCH] Compilation fix for idlebalance
Alexandr AndreevPavel EmelianovAlexandr Andreev
4eeb956ef32 [SCHED] Improve idle load balance. Idle balance is called from an idle thread on rebalance_tick(). load_balance() tries to find the busiest group in idle_vsched, where there are no really running tasks. With this patch, load_balance() will try to find the busiest vsched first and, in case of success, then find the busiest group inside this vsched, and so on...
Andrey MirkinPavel EmelianovAndrey Mirkin
4bf11416ab8 [CPT] Fix IPv6 addresses restore. All IPv6 addresses based on MAC are created with valid lifetime 0. We checkpoint them and try to restore, but fail as inet6_addr_add() returns -EINVAL if valid_lft is zero. We can use ifaddr flags to find correct values for preferred and valid lifetimes. TODO: Kernel creates automatically local ipv6 address based on MAC address on it when interface is upped. We can manually remove this addre...
Andrey MirkinPavel EmelianovAndrey Mirkin
54170a488ad [CPT] unlimit dcachesize on restore. Recently we added adjusting of 3 limits on restore so as not to fail because of hitting limits. Now we have to add another one: dcachesize.
Denis V. LunevPavel EmelianovDenis V. Lunev
9aeff86fc2f [NFS] fix lockd context when bind mounted from VE0 to VE
Konstantin KhorenkoPavel EmelianovKonstantin Khorenko
9e27d0ad506 [PROC] mainstream: race between proc_lookup() and sys_delete_module(). Fix for the race between proc_lookup() and sys_delete_module(): proc_lookup() can find PDE under proc_subdir_lock, on 2nd CPU sys_delete_module() removes pde and module, then first CPU tries to get de and module in proc_get_inode()... Bum...
Alexandr AndreevPavel EmelianovAlexandr Andreev
78c679a27a4 [VESTATS] use jiffies instead of cycles for mm stats. This implementation is very simple but, strictly speaking, not that accurate, because we can add 10 000 000 (or more) cycles (it's ~ 1 jiffy) even if the actual allocation consumes < 10 000 cycles but the jiffy has changed at that moment.
Pavel EmelianovPavel Emelianov
b8617c8296d [LOCKDEP] Another fix for virtualized filesystems lockdep. As described before, filesystems in our kernels are no longer static objects and thus lockdep refuses to work. This was (wrongly) fixed by setting one static class for all super block's semaphores and locks. It turned out that different filesystems use different lock ordering for sb locks and some other ones, e.g. UDF may take inode->i_mutex under sb->s_lock, while ext3 takes sb->s_lock under ...
Vasily TarasovPavel EmelianovVasily Tarasov
760d938c43e [IOPRIO] elevator switch oops fix. When an elevator switch happens and UBs persist, the put of an async cfqq can happen a second time due to a non-NULL value left in the array.
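The failure mode in this message (a second put through a stale array entry) is commonly avoided by clearing the slot when the reference is dropped. The sketch below assumes that approach; the message does not state how the patch fixes it, and `struct cfqq_sketch` and `put_async_queues()` are hypothetical stand-ins rather than CFQ code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queue object with a reference count. */
struct cfqq_sketch {
	int ref;
};

/*
 * Drop the reference held by each array slot and clear the slot, so that
 * running the same teardown again (e.g. on a later elevator switch)
 * cannot put the same queue a second time.
 */
static void put_async_queues(struct cfqq_sketch **slots, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (slots[i] == NULL)
			continue;
		slots[i]->ref--;
		slots[i] = NULL;	/* prevents a double put */
	}
}
```

Calling `put_async_queues()` twice on the same array is harmless here: the second call sees only NULL entries.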
Vasily TarasovPavel EmelianovVasily Tarasov
83fda956c79 [BC] vmguar_enough_memory() oopses if called from kernel thread. If the vmguar_enough_memory() function is called by a kernel thread, it oopses because task_struct->mm is NULL. Such a situation was encountered when aufs was over ramfs.
Alexey DobriyanPavel EmelianovAlexey Dobriyan
f3871ef8075 [BC] refcount leak in dup_mm() on error path. Fix simple beancounter refcount leak on error path in dup_mm().
Andrey MirkinPavel EmelianovAndrey Mirkin
bdae7bcface [BC] Fix potential beancounter refcount leak. On some error paths we forget to put beancounter. This patch fixes two such places: - sys_setluid() - bc_entry_open()
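The class of bug fixed here (a reference taken early and never dropped on an error path) is usually handled with a single cleanup label. The following is a generic sketch of that pattern, not the code of sys_setluid() or bc_entry_open(); `struct bc_sketch`, `bc_get()`, `bc_put()` and `attach_bc()` are hypothetical names.

```c
#include <errno.h>

/* Hypothetical refcounted object standing in for a beancounter. */
struct bc_sketch {
	int refcount;
};

static void bc_get(struct bc_sketch *bc) { bc->refcount++; }
static void bc_put(struct bc_sketch *bc) { bc->refcount--; }

/*
 * Take a reference, run a step that may fail, and make sure the
 * reference is dropped again on the error path. On success the
 * reference is handed over to *slot.
 */
static int attach_bc(struct bc_sketch *bc, struct bc_sketch **slot,
		     int step_fails)
{
	int err;

	bc_get(bc);
	if (step_fails) {	/* stands in for any later failing call */
		err = -ENOMEM;
		goto out_put;
	}
	*slot = bc;		/* success: the reference is now owned by slot */
	return 0;

out_put:
	bc_put(bc);		/* the put that a leaky error path forgets */
	return err;
}
```

Routing every failure after `bc_get()` through `out_put` keeps the get/put pairing visible in one place, which makes a missing put easier to spot in review.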
Dmitriy MonakhovPavel EmelianovDmitriy Monakhov
a2a4d6a76ff [EXT3] "ext[34]: EA block reference count racing fix" performance fix. From: Andrew Morton <akpm@linux-foundation.org>. A little mistake in 8a2bfdcbfa441d8b0e5cb9c9a7f45f77f80da465 is making all transactions synchronous, which reduces ext3 performance to comical levels. Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kirill KorotaevPavel EmelianovKirill Korotaev
686ab8afd87 [NMI] set default NMI watchdog timeout to 30 secs. Increase default NMI watchdog timeout to 30 seconds as it was in 2.6.9
Alexey DobriyanPavel EmelianovAlexey Dobriyan
64626d3df91 [PATCH] mainstream: fix sys_accept() error path. * d_alloc() in sock_attach_fd() fails leaving ->f_dentry NULL * bail out to out_fd label, which does fput()/__fput() on new file * but __fput() assumes valid ->f_dentry
Pavel EmelianovPavel Emelianov
f3ffb2c91b6 [BC] Check correct user_beancounter passed first in ub_page_uncharge. If a page accidentally has a not-removed page_beancounter, the kernel will oops dereferencing ub->ub_percpu(). Move the BUG_ON up to be sure we work with a user_beancounter.
Pavel EmelianovPavel Emelianov
33c88831f75 [BC] Remove redundant ub == NULL check from free_ub(). All callers always pass a non-NULL pointer.
Pavel EmelianovPavel Emelianov
a9b3da91c61 [BC] Don't make pre-created INDEX_AC and INDEX_L3 caches UBC. This made size-32 and size-64 caches on i386 be the same capacity as size-X(UBC) ones.
OpenVZ teamPavelOpenVZ team
63694d20699 linux-2.6.20-ovz004 released
Vasily TarasovPavel EmelianovVasily Tarasov
686c66c5382 [IOPRIO] Fix cfqq index calculation in async case. The ioprio field of task_struct consists of two numbers: 1) value of class (bits 14-16), 2) value of data (bits 0-13). The value of data is allowed to belong to the range [0, 7]. In the current implementation of cfq_set_request, tsk->ioprio is used as the index of the *async_cfqq[8] array. It is wrong because tsk->ioprio can be >> 8. This can cause either corruption or reading an insufficient value: cfq_set_request...
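The indexing mistake can be illustrated with the ioprio helper macros as they are defined in mainline `include/linux/ioprio.h` of that era (class shift of 13; the bit numbers in the message appear to count from 1). The sketch below is an assumption about the shape of a fix, since the message is truncated before describing it: `async_index()` and `ASYNC_SLOTS` are hypothetical names.

```c
#include <assert.h>

/* Macros reproduced from mainline include/linux/ioprio.h for illustration. */
#define IOPRIO_CLASS_SHIFT	13
#define IOPRIO_PRIO_MASK	((1UL << IOPRIO_CLASS_SHIFT) - 1)
#define IOPRIO_PRIO_CLASS(mask)	((mask) >> IOPRIO_CLASS_SHIFT)
#define IOPRIO_PRIO_DATA(mask)	((mask) & IOPRIO_PRIO_MASK)

#define ASYNC_SLOTS 8	/* hypothetical name for the async_cfqq[8] size */

/*
 * Hypothetical helper: derive the array index from the data part only.
 * Using the raw ioprio value would also include the class bits and
 * could index far beyond the 8-entry array.
 */
static unsigned int async_index(unsigned short ioprio)
{
	unsigned int data = IOPRIO_PRIO_DATA(ioprio);

	assert(data < ASYNC_SLOTS);	/* data is expected to be in [0, 7] */
	return data;
}
```

For example, class 2 with data 4 is encoded as `(2 << 13) | 4`, i.e. 16388. Used directly as an index that value is far out of bounds, while `async_index()` returns 4.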
Pavel EmelianovPavel Emelianov
d1109bdc15c Fixed a misprint in kernel/sched.c after conflicts resolve :(
OpenVZ teamPavel EmelianovOpenVZ team
86714dd6ab1 Merge /linux/kernel/git/stable/linux-2.6.20.y
Vasily TarasovPavel EmelianovVasily Tarasov
0d243af4d66 [UBIOPRIO] new cfq queue putting mechanism. It's better to use the original cfqq put function from CFQ than to rewrite it. But we need to export this function somehow: use the elevator_ops structure for it.
Pavel EmelianovPavel Emelianov
8307bf82e73 Cleanup init_dev() to make OVZ patch smaller.