OpenVZ-legacy

linux-2.6.18-openvz

Author  Commit  Message  Commit date  Issues
OpenVZ team, Pavel
b3d0298fdd4 linux-2.6.18-028stab021 released
Vasily Tarasov, Pavel Emelianov
f0967ebfed5 [IOPRIO] request gets a beancounter
Previously, we could release a beancounter that still had requests in flight. Now every request gets a ub, so a beancounter can disappear only after all its requests have completed. This reasoning concerns async requests only. Also note that the final put of a beancounter can now occur with the queue lock held, so we modify the release_beancounter() function so that it always uses another thread for the actual relea...
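The lifetime rule described in this message can be sketched as plain reference counting with a deferred release. This is a minimal illustration only, not the kernel code: the `Beancounter` class, the `release_queue` and `run_deferred_releases()` are hypothetical names, and the queue merely stands in for "another thread" doing the actual release.

```python
from collections import deque

# Hypothetical stand-in for the worker that performs releases outside the queue lock.
release_queue = deque()


class Beancounter:
    """Toy beancounter with a reference count (illustrative, not the kernel structure)."""

    def __init__(self, uid):
        self.uid = uid
        self.refs = 1          # reference held by the owner
        self.released = False

    def get(self):
        # Every in-flight request takes its own reference.
        self.refs += 1
        return self

    def put(self):
        # The final put may happen in a context where releasing directly is unsafe,
        # so it only queues the object for a later release.
        self.refs -= 1
        if self.refs == 0:
            release_queue.append(self)


def run_deferred_releases():
    """Perform the actual release later, outside the caller's context."""
    while release_queue:
        release_queue.popleft().released = True
```

With this shape, dropping the owner's reference while a request is still in flight does not release the object; only the put from the last completed request queues it.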
Denis V. Lunev, Pavel Emelianov
eda5b00e671 [IOPRIO] Compilation fix for ub_prio
Alexey Dobriyan, Pavel Emelianov
099052eb6bb Limit setluid caps in VE as in 2.6.9.
Alexey Dobriyan, Pavel Emelianov
3671cac3a93 As per debug session and discussions with Pavel, dealing with kmemsize precharges was done a bit early, i.e. kmem_cache_free(sighand) could take a shortcut and bump the kmemsize precharge again when freeing sighand, after ub_task_uncharge() had dealt with it. So we need to deal with the kmemsize precharge in ub_task_put().
Vasily Tarasov, Pavel Emelianov
7d547b7630f [IOPRIO] get queue lock on async queues put
When a beancounter dies, all async queues associated with it are removed from the hash list. This operation should be protected by the queue lock.
OpenVZ team, Pavel
80e174c9aac linux-2.6.18-028stab020 released
Vasily Tarasov, Pavel Emelianov
96f3f6cf053 [IOPRIO] excess list_del in forced dispatching case
We don't need to delete cfq_bc_data from the active list in the forced dispatching case, because it happens automatically: cfq_forced_dispatch_cfqqs() -> cfq_dispatch_insert() -> cfq_remove_request() -> cfq_del_crq_rb() -> cfq_del_cfqq_rr() -> ...
OpenVZ team, Pavel
223b47dc686 linux-2.6.18-028test019 released
Andrey Mirkin, Pavel Emelianov
9b04401d071 [VETH] MAC filtering on veth device
1. enabled MAC address setting on veth devices from inside VE
2. enabled MAC filtering on veth by default.
3. also added checks for correctness (non-zero and non-multicast) of address.
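The correctness check in item 3 (non-zero and non-multicast) can be illustrated as follows. This is a sketch of the general Ethernet rule, not the veth driver code; the function name `is_valid_unicast_mac` and the colon-separated string format are assumptions made for the example.

```python
def is_valid_unicast_mac(mac: str) -> bool:
    """Return True for a non-zero, non-multicast MAC such as '00:18:51:aa:bb:cc'."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    is_zero = all(o == 0 for o in octets)
    # The least significant bit of the first octet marks group (multicast) addresses;
    # the broadcast address ff:ff:ff:ff:ff:ff has this bit set as well.
    is_multicast = bool(octets[0] & 0x01)
    return not is_zero and not is_multicast
```

An all-zero address and any address with the group bit set are rejected; everything else is accepted as a unicast address.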
Andrey Savochkin, Pavel Emelianov
e6649e8423b Fix for over-optimization of OTHERSOCKBUF accounting. For those sockets there is no protection by socket sock.
The bug was provoked by the optimization of charging/uncharging othersockbufs: diff-ubc-tcpsndopt-20060429. In brief, the idea is the following: the optimization is based on the assumption that the socket is always locked by lock_sock and so protected from use by more than one user simultaneously. But this assumption is wrong for datagram sockets (for example PF_UNIX ones), that are not locked in the majo...
Alexey Dobriyan, Pavel Emelianov
3fb47459ca2 [PATCH] buffer: memorder fix
unlock_buffer(), like unlock_page(), must not clear the lock without ensuring that the critical section is closed. Mingming later sent the same patch, saying: We are running SDET benchmark and saw double free issue for ext3 extended attributes block, which complains the same xattr block already being freed (in ext3_xattr_release_block()). The problem could also been triggered by multiple thr...
Eric Sandeen, Pavel Emelianov
184dd57af87 [PATCH] return ENOENT from ext3_link when racing with unlink
Return -ENOENT from ext[34]_link if we've raced with unlink and i_nlink is 0. Doing otherwise has the potential to corrupt the orphan inode list, because we'd wind up with an inode with a non-zero link count on the list, and it will never get properly cleaned up & removed from the orphan list before it is freed.
[akpm@osdl.org: build fix]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: <...
Kirill Korotaev, Pavel Emelianov
9773dc0823f [PATCH] make static counters in new_inode and iunique be 32 bits
From: Jeff Layton <jlayton@redhat.com>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
When a 32-bit program that was not compiled with large file offsets does a stat and gets a st_ino value back that won't fit in the 32 bit field, glibc (correctly) generates an EOVERFLOW error. We can't do anything about fs's with larger permanent inode numbers, but when we generate them on th...
Vasily Tarasov, Pavel Emelianov
b2188163f0a [IOPRIO] A handle to switch off write prioritization
Adds a write_virt_mode file in /sys/block/<device>/queue/iosched/ that allows switching off prioritization of async requests. Switching prioritization off is achieved by setting the owner of newly created cfq_queues to UB0.
Vasily Tarasov, Pavel Emelianov
7d191bf7c59 [IOPRIO] Switches UB context in places necessary for proper write scheduling
We need information about the owner of async requests. We get it from the page marked by the IO Accounting feature and change context to that of the page's owner. Later cfq picks up the current context and handles it properly (previous patch).
Vasily Tarasov, Pavel Emelianov
bb718a47c9f [IOPRIO] Support of prioritized write inside cfq
All ubioprio patches so far do fair prioritization only for sync requests. For async requests a problem exists: who actually produced the request? Using the OVZ IO Accounting feature we can obtain this arcane knowledge. The IO Accounting feature marks each page with who made it dirty. At the moment of request submission we change context to the context of the owner of the page (this operati...
Vasily Tarasov, Pavel Emelianov
e5b83d9656f [IOPRIO] A handle to switch virtmode off/on
Adds a virt_mode file in /sys/block/<device>/queue/iosched/ that allows switching off virtualized mode. Switching virtualization off is achieved by setting the owner of newly created cfq_queues to UB0. Note that this means already created processes will still be scheduled.
Vasily Tarasov, Pavel Emelianov
1b0602c43c3 [IOPRIO] Configurable base timeslice duration
Elevator parameters can be managed using files in /sys/block/<device>/queue/iosched/. This patch adds a ub_slice file there that controls the base_slice value. By default it is HZ/2.
Vasily Tarasov, Pavel Emelianov
2a92327992f [IOPRIO] Add userspace interface for beancounter ioprio management
This patch modifies the ioprio_set() syscall to accept the IOPRIO_WHO_UBC constant as a second parameter, which indicates that ioprio should be set for a beancounter. Additional information on using this syscall is located in the Documentation/block/ioprio.txt file.
Vasily Tarasov, Pavel Emelianov
0f23938a3fb [IOPRIO] Introduce active beancounter switch
This is a very important patch that implements the actual scheduling of UBs. UB holds a list of active UBs (UBs that have I/O requests at the moment) in last-serviced order. This lets us get the next UB to service in O(1) time. The duration of the time slice depends on UB priority in the following way: ub_slice = base_slice + (base_slice * (ioprio - UB_IOPRIO_MIN)) / (UB_IOPRIO_MAX - UB_IOPRIO_MIN)...
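A worked example of the time slice formula, as far as the truncated message shows it. The values of `UB_IOPRIO_MIN` and `UB_IOPRIO_MAX` below are assumptions chosen for illustration (the message does not state them), and integer division is assumed because the original is kernel arithmetic.

```python
# Assumed priority bounds, for illustration only; the commit message does not give them.
UB_IOPRIO_MIN = 0
UB_IOPRIO_MAX = 7


def ub_slice(base_slice: int, ioprio: int) -> int:
    """Time slice for a UB: base_slice at the lowest priority, twice that at the highest."""
    if not UB_IOPRIO_MIN <= ioprio <= UB_IOPRIO_MAX:
        raise ValueError("ioprio out of range")
    # Integer division mirrors kernel arithmetic (an assumption of this sketch).
    return base_slice + (base_slice * (ioprio - UB_IOPRIO_MIN)) // (UB_IOPRIO_MAX - UB_IOPRIO_MIN)
```

Under these assumed bounds the slice grows linearly with priority, from `base_slice` at `UB_IOPRIO_MIN` to `2 * base_slice` at `UB_IOPRIO_MAX`.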
Vasily Tarasov, Pavel Emelianov
8a8e15be204 [IOPRIO] Handle forced dispatching case
Sometimes the driver asks the elevator to throw down all in-flight requests to it (to the driver). This is called forced dispatching. In CFQ it is handled by separate functions, and this patch modifies them to work properly with the ubioprio feature.
Vasily Tarasov, Pavel Emelianov
dfcdd8ca977 [IOPRIO] Switch cfq to use virtualized data structures
One of the main parts of the ubioprio feature is to switch the CFQ algorithm from working on per-device structures (cfq_data) to working on per-(device, ub) structures (cfq_bc_data). This patch does it.
Vasily Tarasov, Pavel Emelianov
42a1c406acb Add hooks on device queue creation/elimination
Now, when a device queue is created or eliminated, all data structures are properly initialized.
Vasily Tarasov, Pavel Emelianov
27e9397c1ac [IOPRIO] Introduce main data structures and functions
The patch defines new data structures and functions that form the basic infrastructure of the ubioprio feature. cfq_bc_data is a structure that represents a pair: (block device, UB). It owns all virtualized data from the original cfq_data structure. Each device (cfq_data) holds a list of cfq_bc_data structures. UB holds a list of active cfq_bc_data structures. The sources contain a more detailed description. The ...
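The relationships described here (one cfq_bc_data per (device, UB) pair, a per-device list of them, and a per-UB list of the active ones) can be modelled roughly as below. This is a simplified sketch: the class and field names are hypothetical, and the real kernel structures carry far more state.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Beancounter:
    """A UB; 'active' lists its per-device entries that currently have requests."""
    uid: int
    active: list = field(default_factory=list)


@dataclass(eq=False)
class CfqData:
    """Per-device scheduler data; holds one CfqBcData per UB seen on this device."""
    name: str
    bc_data: list = field(default_factory=list)


@dataclass(eq=False)
class CfqBcData:
    """The (block device, UB) pair that owns the virtualized scheduler state."""
    device: CfqData
    ub: Beancounter
    queued: int = 0


def find_or_create(device: CfqData, ub: Beancounter) -> CfqBcData:
    """Return the entry for this (device, UB) pair, creating it on first use."""
    for bcd in device.bc_data:
        if bcd.ub is ub:
            return bcd
    bcd = CfqBcData(device, ub)
    device.bc_data.append(bcd)
    return bcd


def add_request(bcd: CfqBcData) -> None:
    """Queue one request and put the pair on its UB's active list if needed."""
    bcd.queued += 1
    if bcd not in bcd.ub.active:
        bcd.ub.active.append(bcd)
```

`eq=False` keeps comparisons identity-based, which avoids recursing through the mutual references between the three classes.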
Vasily Tarasov, Pavel Emelianov
48d5939447d [IOPRIO] Adds appropriate description in Kconfig file
The patch adds a description of the ubioprio feature to the Kconfig file.
Vasily Tarasov, Pavel Emelianov
67b80a9ea60 This patch just moves required declarations and definitions from cfq-iosched.c to external cfq-iosched.h file.
This patchset introduces I/O prioritization for beancounters. The feature is implemented on the basis of the CFQ I/O scheduler. In the CFQ I/O scheduler each process obtains a time slice whose duration depends on the priority of the process. We use the same approach for UBs. All the logic of I/O scheduling for processes belonging to a particular UB is preserved. So we obtain something like 2-level CFQ scheduli...
Denis V. Lunev, Pavel Emelianov
8676f908b3f This patch virtualizes NFS lockd inside VE.
Please note that ve_allow_kthreads should be set to 1.
Alexey Dobriyan, Pavel Emelianov
5f2bbfa6383 vm86(2) was never implemented on x86_64, always giving warnings. First, warning was rate-limited, then non-rate-limited warning, then half-assed warning which doesn't prevent dmesg spamming.
So, remove it completely.
Denis V. Lunev, Pavel Emelianov
84962125e24 OOM generation should be updated on mm destroy, not task exit.
This patch calculates OOM generations directly. The counter is increased when the MM of a process killed by OOM is finally destroyed.
Denis V. Lunev, Pavel Emelianov
ac7f8597f50 tcp_v4_get_port fixed. Treat local_port_range[0] > local_port_range[1] as local_port_range[1] == local_port_range[0].
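The rule stated in this message can be written as a small normalization step. This is a sketch of the stated rule only, not of tcp_v4_get_port itself; the helper name `effective_port_range` is hypothetical.

```python
def effective_port_range(low: int, high: int) -> tuple:
    """Apply the rule from the commit message: an inverted range collapses to a single port."""
    if low > high:
        # Treat local_port_range[0] > local_port_range[1] as if both bounds equal the first.
        high = low
    return low, high
```

A well-formed range passes through unchanged, while an inverted one yields a range containing only the lower bound's port.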
Alexey Kuznetsov, Pavel Emelianov
17fb8518975 Forgotten bits of pid virtualization in sys_wait*
Alexey Kuznetsov, Pavel Emelianov
a87352ffdb4 [IA64] forgotten line in TIF_RESTORE_SIGMASK
Denis V. Lunev, Pavel Emelianov
4b17687ffaa [IPV6] MCAST: Fix joining all-node multicast group on device initialization.
From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date: Mon, 15 Jan 2007 05:48:40 +0000 (-0800)
[IPV6] MCAST: Fix joining all-node multicast group on device initialization. Join all-node multicast group after assignment of dev->ip6_ptr because it must be assigned when ipv6_dev_mc_inc() is called. This fixes Bug#7817, reported by <gernoth@informatik.uni-erlangen.de>.
Closes: 7817
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.ne...
OpenVZ team, Pavel
cb83d0360de linux-2.6.18-028test018 released
Evgeny Kravtsunov, Pavel Emelianov
b9bff58fc99 Missed ve context switch in NFS RPC code.
pipefs switches the context to ve0 and never returns to the ve context. This happens in the __rpc_execute (net/sunrpc/sched.c) and svc_recvfrom (net/sunrpc/svcsock.c) functions. It causes an oops on starting a ve when the ve private area is placed on an nfs partition.
Vasiliy Averin, Pavel Emelianov
156048b79ad ext3 error behavior was broken in linux kernels since 2.5.x versions by the following patch:
2002/10/31 02:15:26-05:00 tytso@snap.thunk.org
Default mount options from superblock for ext2/3 filesystems
http://linux.bkbits.net:8080/linux-2.6/gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ
In case ext3 file system is mounted with errors=continue (EXT3_ERRORS_CONTINUE) errors should be ignored when possible. However at present in case of any error kernel aborts journal and remounts filesystem to ...
Vasily Tarasov, Pavel Emelianov
67e1668cb46 task puts UBC before the task becomes invisible for all (e.g. /proc), thus a task can be found on the list without exec_env/owner_env which should not happen. Introduced by diff-ubc-dont-uncharge-in-RCU-20070212RCU-20070212
Dmitry Mishin, Pavel Emelianov
06d911fb785 EXT3_ERRORS_CONTINUE should be taken from the superblock as default value for error behaviour.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@openvz.org>
Vasily Averin, Pavel Emelianov
5b12b303de2 EXT2_ERRORS_CONTINUE should be read from the sb as default error behaviour. parse_option() should clean the alternative options and should not change default value taken from the superblock.
Signed-off-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@openvz.org>
Kirill Korotaev, Pavel Emelianov
d7d8cf0c663 Revert diff-ms-ext3-retries-20061109 until all the issues are resolved.
Kir Kolyshkin, Pavel Emelianov
b3b2f114059 Patch from mainstream: [SPARC64]: Fix Tomatillo/Schizo IRQ handling.
The code in schizo_irq_trans_init() should set irq_data->sync_reg to the location of the SYNC register if this is Tomatillo, and set it to zero otherwise. But that is not what it is doing. As a result, non-Tomatillo systems were trying to access a non-existent register resulting in bus errors at the first PCI interrupt. Thanks to Roland Stigge for the bug report.
Signed-off-by: David S. Mil...
Alexey Dobriyan, Pavel Emelianov
6bcea8c4c35 Same story as with p4-clockmod. Driver does set_cpus_allowed(cpu), then checks for smp_processor_id() being equal to "cpu".
http://bugzilla.openvz.org/show_bug.cgi?id=467
OpenVZ team, Pavel Emelianov
6d6cd5dd70f Merge git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.18.y
Greg Kroah-Hartman
299a2479bca Linux 2.6.18.8
Hugh Dickins, Greg Kroah-Hartman
b3008f65500 fix umask when noACL kernel meets extN tuned for ACLs
Fix insecure default behaviour reported by Tigran Aivazian: if an ext2 or ext3 filesystem is tuned to mount with "acl", but mounted by a kernel built without ACL support, then umask was ignored when creating inodes - though root or user has umask 022, touch creates files as 0666, and mkdir creates directories as 0777. This appears to have worked right until 2.6.11, when a fix to the default mo...
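The arithmetic behind this report is the standard umask rule: the mode of a new inode is the requested mode with the umask bits cleared. A short worked example, with `creation_mode` as a hypothetical helper name:

```python
def creation_mode(requested: int, umask: int) -> int:
    """Mode of a newly created inode: requested permission bits with the umask bits cleared."""
    return requested & ~umask


# With umask 022, touch (requesting 0666) should yield 0644
# and mkdir (requesting 0777) should yield 0755.
file_mode = creation_mode(0o666, 0o022)
dir_mode = creation_mode(0o777, 0o022)
```

The behaviour described in the message corresponds to the umask not being applied at all, so files kept the requested 0666 and directories the requested 0777.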
Badari Pulavarty, Greg Kroah-Hartman
4f1e627105e Fix for shmem_truncate_range() BUG_ON()
Ran into BUG() while doing madvise(REMOVE) testing. If we are punching a hole into shared memory segment using madvise(REMOVE) and the entire hole is below the indirect blocks, we hit following assert. BUG_ON(limit <= SHMEM_NR_DIRECT);
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: ...
Hugh Dickins, Greg Kroah-Hartman
f102c840f7f make ppc64 current preempt-safe
Repeated -j20 kernel builds on a G5 Quad running an SMP PREEMPT kernel would often collapse within a day, some exec failing with "Bad address". In each case examined, load_elf_binary was doing a kernel_read, but generic_file_aio_read's access_ok saw current->thread.fs.seg as USER_DS instead of KERNEL_DS. objdump of filemap.o shows gcc 4.1.0 emitting "mr r5,r13 ... ld r9,416(r5)" here for get_p...
Hugh Dickins, Greg Kroah-Hartman
700019f9fea fix msync error on unmapped area
Fix the 2.6.18 sys_msync to report -ENOMEM correctly when an unmapped area falls within its range, and not to overshoot: to satisfy LSB 3.1 tests and to fix Debian Bug#394392. Took the 2.6.19 sys_msync as starting point (including its cleanup of repeated "current->mm"s), reintroducing the msync_interval and balance_dirty_pages_ratelimited_nr needed in 2.6.18. The misbehaviour fixed here may n...
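The reporting rule described here (-ENOMEM when any part of the requested range is unmapped) can be illustrated with a toy interval check. This is a sketch under assumptions, not the sys_msync implementation: mappings are modelled as half-open `(start, end)` tuples, the function name is hypothetical, and no syncing is performed.

```python
import errno


def msync_range_status(mappings, start, length):
    """Return 0 if [start, start + length) is fully mapped, else -ENOMEM.

    'mappings' is a list of half-open (start, end) address intervals (a toy model).
    """
    end = start + length
    pos = start                      # first address not yet known to be mapped
    for m_start, m_end in sorted(mappings):
        if m_end <= pos:
            continue                 # mapping lies entirely before the current position
        if m_start > pos:
            break                    # a hole begins at pos
        pos = m_end                  # mapping covers pos; advance past it
        if pos >= end:
            return 0
    return 0 if pos >= end else -errno.ENOMEM
```

For example, with mappings at 0x1000-0x3000 and 0x4000-0x5000, a range that stays inside the first mapping reports 0, while a range crossing the gap at 0x3000-0x4000 reports -ENOMEM.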
Hugh Dickins, Greg Kroah-Hartman
dbee2bf2f31 read_zero_pagealigned() locking fix
Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range: kernel bugzilla 7645. Right: read_zero_pagealigned uses down_read of mmap_sem, but another thread's racing read of /dev/zero, or a normal fault, can easily set that pte again, in between zap_page_range and zeromap_page_range getting there. It's been wrong ever since 2.4.3. The simple fix is to use down_write instead, but tha...