Author | Commit | Message | Commit date | Issues | |
---|---|---|---|---|---|
OpenVZ team | 7fb6904faa4 | linux-2.6.20-ovz005 released | |||
Alexandr Andreev | 4e6b27576b0 | [SCHED] Reduce starvation of some VCPUs in case of cpu limits. Change the logic of choosing best_vcpu to schedule to. There are two potential problems: a) if a vcpu is hot, and the last used physical CPU of this vcpu is equal to smp_processor_id(), it will always be chosen. This is not a good decision, because there is no guarantee that _all_ physical CPUs must take vcpus from a vsched. For example, if cpulimit for a vsched is small, this vsched can be run only... | |||
Alexandr Andreev | 60d4fc03c08 | [SCHED] find_idle_vcpu() mask check fix. In find_idle_vcpu() we skip VCPUs whose IDs are not set in the physical '*cpus' mask. This is incorrect. We must skip VCPUs that have appropriate VCPU->last_pcpu | |||
Alexey Kuznetsov | abe1582e6b9 | [CPT] checkpointing robust lists. Otherwise we are going to have problems with migration of newer glibcs using robust lists when this is possible. | |||
Alexey Kuznetsov | b676b557c90 | [CPT] alternative way to migrate zombie processes. In older 2.6.8 kernels do_exit() was very simple: essentially it disposed m etc, which is done automatically while checkpointing, and did some work on notifying the parent. So it was natural to move the restored process to zombie state by hand. In 2.6.18 do_exit() does _lots_ of work. It seems easier to invert the logic. We introduce a new flag PF_RESTART_EXIT, which suppresses the work which was ... | |||
Pavel Emelianov | 78d95858492 | [CPT] Fix lockdep warning on socket dump. CPT locks all the sockets it finds for dumping. This is OK, but lockdep treats it as circular locking. It happens each time we migrate a VE with more than one socket aboard. | |||
Vasily Tarasov | 0fb7b42a294 | [PATCH] kconfig: security depends on !ve. Many people have CONFIG_SECURITY enabled in their configs. When they try to do `make oldconfig` for OpenVZ kernels with such configs, no questions appear concerning CONFIG_VE and friends, and people have OpenVZ kernels with virtualization features disabled. Fix it. Reverse the dependency of VE/SECURITY. | |||
Pavel Emelianov | ebba4d4778e | [IOACCT] Fix ioacct races. When a page becomes dirty there's no time to store a context on it - the page may become clean immediately. Thus we had a race in accounting when a page became clean before we set a context on it, and this context got lost and was not freed. Handle the context the other way - in case we're going to set a new context on a page that already has one - free it and account written bytes in case the page beca... | |||
Pavel Emelianov | 26fd0654761 | [IOACCT] Debug on page release. When releasing an IO beancounter from the page that is not supposed to IO pb, print a warning. | |||
Alexey Dobriyan | 39f3e4faa35 | [BC] UB put on error path in fork(). If fork() fails after ub_task_charge(), nobody puts the three beancounters obtained there. | |||
Alexey Dobriyan | 26d2416d9d2 | [BC] uncharge fs root (/) from dcachesize. The "/" dentry was charged in d_alloc_root(), then charged and uncharged during filesystem activity. But at umount time that first charge was forgotten. So uncharge "/" by hand. | |||
Pavel Emelianov | ad4d59a0c82 | [BC] Percpu counters discrepancy on 64bit arches. The operation long += -(unsigned int); leads to a wrong result on 64bit due to no sign extension. | |||
Konstantin Khorenko | 1a98ff3dcd5 | Fix a livelock in stop machine. A possible situation in stop_machine: - stopmachine_state == STOPMACHINE_WAIT; - STOPPER (stop_machine()) is in state SM_STOPPER_WAITING, calling yield() in a loop; - SLAVES (stopmachine()) also call yield() in a loop. This leads to contention on the fairsched_lock on all CPUs, and in case of unfair lock acquisition rules (for example on a NUMA node), some CPUs can wait for the lock forever/for a long ... | |||
Evgeny Kravtsunov | a0771080e6f | [BRIDGE] Unaligned access on IA64 when compare ether addr. The patch fixes an unaligned access that takes place on ia64 in compare_ether_addr(). compare_ether_addr() requires addresses to be aligned on a 2-byte boundary, while addresses declared in bridges are aligned on 1-byte. | |||
Vasily Tarasov | 255968d0558 | [IOPRIO] cleaning active beancounter. After a beancounter disappears, it can still be active. Clean it up. | |||
Denis V. Lunev | d69c518c043 | [VENET] stop IP management before freeing venet. The device is freed before the VE<->IP mapping is cleaned. | |||
Pavel Emelianov | 57a997d6a3b | [NETFILTER] Allow iptables to work in 32bit VEs on 64bit machines. A silly mistake caused all VE's requests to receive -EPERM. | |||
Pavel Emelianov | 3f064957a63 | Send signals to groups using global pids, not virtual. The problem is that __kill_pgrp_info() is called sometimes with global pid, sometimes with local one. Since we do not have arithmetic split of pids in 2.6.20 all the callers must pass the pids of one type. This was wrong. | |||
Alexandr Andreev | 0f2fad545e1 | [SCHED] VCPU should be initialized completely before deletion. There is a race in vsched_del_vcpu() - we can kill migration_thread() even if it has not started yet, i.e. the migration_thread() function has not been called at all. So, migrate_live_tasks() and migrate_dead_tasks() will not be called on this vcpu while the migration thread is killed. But there can be some tasks that have already migrated on thi... This bug can be easily reproduced. On a busy host with many running tasks user can run: In this case, after the second vzctl, the migration thread on VCPU 2 will be created and just woken up, but it may not really be started (scheduled) yet if there are a lot of other higher-priority tasks running on the host. If it is not scheduled before the third vzctl call, there will be kernel bug in vsche... | |||
Alexandr Andreev | 9b758cd113c | [SCHED] find_busiest_group() should use pcpu mask. VCPUs should be skipped according to pcpu mask | |||
Alexandr Andreev | 535be4f594d | [SCHED] Fix for cpu_of(). In the new scheme, i.e. when the physical cpu mask is used whenever possible (in find_busiest_vsched(), find_busiest_queue() and so on), cpu_of() must also return the physical cpu id for a given vcpu. We have to use virtual ids (vcpu->id) only for vsched maps and for the process cpus allowed mask. In all other cases we need to use physical masks to account for the physical CPUs' topology. | |||
Alexandr Andreev | de939d37e50 | [SCHED] Cleanup: use vcpu_last_pcpu macro instead of vcpu->last_pcpu. Replace vcpu->last_pcpu by vcpu_last_pcpu(vcpu), to fix compilation without CONFIG_VSCHED_VCPU | |||
Alexandr Andreev | 93db2629f01 | [SCHED] small cleanup of code. Remove unnecessary argument this_pcpu (=== smp_processor_id()) from find_idle_target() and find_busiest_vsched() | |||
Alexandr Andreev | 237b70cd323 | [SCHED] Improve vcpu scheduling taking into account cache hotness. In the original OVZ kernel schedule_vcpu() takes the next VCPU from the vsched->active list, and it doesn't take into account vcpu->last_pcpu, so VCPUs can jump from PCPU to PCPU too often. Try to skip 'hot' VCPUs, i.e. VCPUs that were running on some other PCPU recently. Time slice threshold is tunable via /proc/sys/kernel/vcpu_hot_timeslice | |||
Alexandr Andreev | 337726a8edb | [SCHED] find_busiest_queue() should select VCPUs from given vsched only. In the new scheme, we choose the vsched in find_busiest_vsched(), i.e. before find_busiest_queue(), so when we look for the busiest queue we must consider this vsched's VCPUs only. | |||
Alexandr Andreev | acc98e53058 | [SCHED] remove debug hunk from previous balance patch. My previous patch for load_balance() contains a wrong condition statement that I forgot to remove after debugging. load_balance() will not pull tasks from the busiest VCPUs if there are < 2 tasks running on the current VCPU. The attached patch removes this incorrect check and fixes the problem. | |||
Kirill Korotaev | 4c3dfbf42ff | [PATCH] Compilation fix for idlebalance | |||
Alexandr Andreev | 4eeb956ef32 | [SCHED] Improve idle load balance. Idle balance is called from an idle thread on rebalance_tick(). load_balance() tries to find the busiest group in idle_vsched, where there are no really running tasks. With this patch, load_balance() will try to find the busiest vsched first and, in case of success, then find the busiest group inside this vsched, and so on... | |||
Andrey Mirkin | 4bf11416ab8 | [CPT] Fix IPv6 addresses restore. All IPv6 addresses based on MAC are created with valid lifetime 0. We checkpoint them and try to restore, but fail as inet6_addr_add() returns -EINVAL if valid_lft is zero. We can use ifaddr flags to find correct values for preferred and valid lifetimes. TODO: Kernel creates automatically local ipv6 address based on MAC address on it when interface is upped. We can manually remove this addre... | |||
Andrey Mirkin | 54170a488ad | [CPT] unlimit dcachesize on restore. Recently we added adjusting of 3 limits on restore so as not to fail because of hitting limits. Now we have to add another one - dcachesize. | |||
Denis V. Lunev | 9aeff86fc2f | [NFS] fix lockd context when bind mounted from VE0 to VE | |||
Konstantin Khorenko | 9e27d0ad506 | [PROC] mainstream: race between proc_lookup() and sys_delete_module(). Fix for the race between proc_lookup() and sys_delete_module(): proc_lookup() can find PDE under proc_subdir_lock, on 2nd CPU sys_delete_module() removes pde and module, then first CPU tries to get de and module in proc_get_inode()... Bum... | |||
Alexandr Andreev | 78c679a27a4 | [VESTATS] use jiffies instead of cycles for mm stats. This implementation is very simple but strictly speaking not that accurate, because we can add 10 000 000 (or more) cycles (that is ~ 1 jiffy) even if the actual allocation consumes < 10 000 cycles, if the jiffy has changed at that moment. | |||
Pavel Emelianov | b8617c8296d | [LOCKDEP] Another fix for virtualized filesystems lockdep. As described before, filesystems in our kernels are no longer static objects and thus lockdep refuses to work. This was (wrongly) fixed by setting one static class for all super block's semaphores and locks. It turned out that different filesystems use different lock ordering for sb locks and some other ones, e.g. UDF may take inode->i_mutex under sb->s_lock, while ext3 takes sb->s_lock under ... | |||
Vasily Tarasov | 760d938c43e | [IOPRIO] elevator switch oops fix. When an elevator switch happens and UBs persist, putting of an async cfqq can happen a second time due to a non-NULL value in the array. | |||
Vasily Tarasov | 83fda956c79 | [BC] vmguar_enough_memory() oopses if called from kernel thread. If the vmguar_enough_memory() function is called by a kernel thread, it oopses because task_struct->mm is NULL. Such a situation was encountered when aufs was over ramfs. | |||
Alexey Dobriyan | f3871ef8075 | [BC] refcount leak in dup_mm() on error path. Fix simple beancounter refcount leak on error path in dup_mm(). | |||
Andrey Mirkin | bdae7bcface | [BC] Fix potential beancounter refcount leak. On some error paths we forget to put beancounter. This patch fixes two such places: - sys_setluid() - bc_entry_open() | |||
Dmitriy Monakhov | a2a4d6a76ff | [EXT3] "ext[34]: EA block reference count racing fix" performance fix. From: Andrew Morton <akpm@linux-foundation.org>. A little mistake in 8a2bfdcbfa441d8b0e5cb9c9a7f45f77f80da465 is making all transactions synchronous, which reduces ext3 performance to comical levels. Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> | |||
Kirill Korotaev | 686ab8afd87 | [NMI] set default NMI watchdog timeout to 30 secs. Increase default NMI watchdog timeout to 30 seconds as it was in 2.6.9 | |||
Alexey Dobriyan | 64626d3df91 | [PATCH] mainstream: fix sys_accept() error path. * d_alloc() in sock_attach_fd() fails leaving ->f_dentry NULL * bail out to out_fd label, which does fput()/__fput() on new file * but __fput() assumes valid ->f_dentry | |||
Pavel Emelianov | f3ffb2c91b6 | [BC] Check correct user_beancounter passed first in ub_page_uncharge. If a page accidentally has a not-removed page_beancounter, the kernel will oops dereferencing ub->ub_percpu(). Move the BUG_ON up to be sure we work with a user_beancounter. | |||
Pavel Emelianov | 33c88831f75 | [BC] Remove redundant ub == NULL check from free_ub(). All callers always pass non-NULL pointer. | |||
Pavel Emelianov | a9b3da91c61 | [BC] Don't make pre-created INDEX_AC and INDEX_L3 caches UBC. This made size-32 and size-64 caches on i386 be the same capacity as size-X(UBC) ones. | |||
OpenVZ team | 63694d20699 | linux-2.6.20-ovz004 released | |||
Vasily Tarasov | 686c66c5382 | [IOPRIO] Fix cfqq index calculation in async case. Field ioprio of task_struct consists of two numbers: 1) value of class (bits 14-16), 2) value of data (bits 0-13). Value of data is allowed to belong to the range [0, 7]. In the current implementation of cfq_set_request tsk->ioprio is used as the index of the *async_cfqq[8] array. It is wrong because tsk->ioprio can be >> 8. This can cause either corruption or reading insufficient value: cfq_set_request... | |||
Pavel Emelianov | d1109bdc15c | Fixed a misprint in kernel/sched.c after conflicts resolve :( | |||
OpenVZ team | 86714dd6ab1M | Merge /linux/kernel/git/stable/linux-2.6.20.y | |||
Vasily Tarasov | 0d243af4d66 | [UBIOPRIO] new cfq queue putting mechanism. It's better to use the original cfqq put function from CFQ than to rewrite it. But we need somehow to export this function: use the elevator_ops structure for it. | |||
Pavel Emelianov | 8307bf82e73 | Cleanup init_dev() to make OVZ patch smaller. |
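
The sign-extension problem described in ad4d59a0c82 ("long += -(unsigned int)") can be reproduced outside the kernel. The sketch below is not the actual patch; the function names are hypothetical, and int64_t and uint32_t are assumed stand-ins for a 64-bit long and a 32-bit unsigned int. It only illustrates why the negation has to happen in the wider signed type.

```c
#include <stdint.h>

/*
 * Illustration only: hypothetical helpers, not code from the OpenVZ tree.
 * int64_t models a 64-bit "long", uint32_t models "unsigned int".
 */

/*
 * Buggy pattern: -delta is computed in 32-bit unsigned arithmetic, so for
 * delta == 1 it is 0xFFFFFFFF. That value is then zero-extended (not
 * sign-extended) to 64 bits before being added to the counter.
 */
int64_t counter_sub_buggy(int64_t counter, uint32_t delta)
{
    counter += -delta;
    return counter;
}

/*
 * One possible fix: widen to the signed 64-bit type first, then negate,
 * so the addend really is a negative number.
 */
int64_t counter_sub_fixed(int64_t counter, uint32_t delta)
{
    counter += -(int64_t)delta;
    return counter;
}
```

With these definitions, counter_sub_buggy(100, 1) yields 4294967395 instead of 99, while counter_sub_fixed(100, 1) yields 99. On a 32-bit arch, where long and unsigned int have the same width, the buggy form wraps around to the intended value, which is consistent with the commit calling this a 64bit-only discrepancy.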