Headline
CVE-2012-6638
The tcp_rcv_state_process function in net/ipv4/tcp_input.c in the Linux kernel before 3.2.24 allows remote attackers to cause a denial of service (kernel resource consumption) via a flood of SYN+FIN TCP packets, a different vulnerability than CVE-2012-2663.
commit f1e79c6abb1e5420fb35e9e8a634efec9fc9ba96 Author: Ben Hutchings Date: Wed Jul 25 04:11:50 2012 +0100 Linux 3.2.24 commit 5a270d169895de9529e2a9a2e610d6aa7e217cef Author: Ryan Bourgeois Date: Tue Jul 10 09:43:33 2012 -0700 HID: add support for 2012 MacBook Pro Retina commit b2e6ad7dfe26aac5bf136962d0b11d180b820d44 upstream. Add support for the 15’’ MacBook Pro Retina. The keyboard is the same as recent models. The patch needs to be synchronized with the bcm5974 patch for the trackpad - as usual. Patch originally written by clipcarl (forums.opensuse.org). [[email protected]: Amended mouse ignore lines] Signed-off-by: Ryan Bourgeois Signed-off-by: Henrik Rydberg Acked-by: Jiri Kosina Signed-off-by: Dmitry Torokhov Signed-off-by: Ben Hutchings commit a090e29a85d5d654848c5d564a7445fdf0e04bf2 Author: Yuri Khan Date: Wed Jul 11 22:12:31 2012 -0700 Input: xpad - add Andamiro Pump It Up pad commit e76b8ee25e034ab601b525abb95cea14aa167ed3 upstream. I couldn’t find the vendor ID in any of the online databases, but this mat has a Pump It Up logo on the top side of the controller compartment, and a disclaimer stating that Andamiro will not be liable on the bottom. Signed-off-by: Yuri Khan Signed-off-by: Dmitry Torokhov Signed-off-by: Ben Hutchings commit e30e7f7dfdaff81816662f074920afedc9b9bed9 Author: Ilia Katsnelson Date: Wed Jul 11 00:54:20 2012 -0700 Input: xpad - add signature for Razer Onza Tournament Edition commit cc71a7e899cc6b2ff41e1be48756782ed004d802 upstream. Signed-off-by: Ilia Katsnelson Signed-off-by: Dmitry Torokhov Signed-off-by: Ben Hutchings commit 6cfcea9befbd9740e856674e43fdbdf01c0b36bc Author: Yuri Khan Date: Wed Jul 11 00:49:18 2012 -0700 Input: xpad - handle all variations of Mad Catz Beat Pad commit 3ffb62cb9ac2430c2504c6ff9727d0f2476ef0bd upstream. The device should be handled by xpad driver instead of generic HID driver. Signed-off-by: Yuri Khan Acked-by: Jiri Kosina Signed-off-by: Dmitry Torokhov Signed-off-by: Ben Hutchings commit f7ca712ade543ecad0e37b0758f77eb0a84c2a40 Author: Henrik Rydberg Date: Tue Jul 10 09:43:57 2012 -0700 Input: bcm5974 - Add support for 2012 MacBook Pro Retina commit 3dde22a98e94eb18527f0ff0068fb2fb945e58d4 upstream. Add support for the 15’’ MacBook Pro Retina model (MacBookPro10,1). Patch originally written by clipcarl (forums.opensuse.org). Signed-off-by: Henrik Rydberg Signed-off-by: Dmitry Torokhov Signed-off-by: Ben Hutchings commit cbcdb622dcf9254668a18a31c4df97c170f86a7a Author: Eric W. Biederman Date: Mon Jul 9 10:51:45 2012 +0000 bonding: Manage /proc/net/bonding/ entries from the netdev events commit a64d49c3dd504b685f9742a2f3dcb11fb8e4345f upstream. It was recently reported that moving a bonding device between network namespaces causes warnings from /proc. It turns out after the move we were trying to add and to remove the /proc/net/bonding entries from the wrong network namespace. Move the bonding /proc registration code into the NETDEV_REGISTER and NETDEV_UNREGISTER events where the proc registration and unregistration will always happen at the right time. Signed-off-by: “Eric W. Biederman” Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 78d40a34b2e35e675b11185e2ff6bd8e9e5145b2 Author: Eric W. Biederman Date: Mon Jul 9 10:52:43 2012 +0000 bonding: debugfs and network namespaces are incompatible commit 96ca7ffe748bf91f851e6aa4479aa11c8b1122ba upstream. The bonding debugfs support has been broken in the presence of network namespaces since it has been added. The debugfs support does not handle multiple bonding devices with the same name in different network namespaces. I haven’t had any bug reports, and I’m not interested in getting any. Disable the debugfs support when network namespaces are enabled. Signed-off-by: “Eric W. Biederman” Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit b5ff5ad1b9299fccc11c801764fa23ca696161cf Author: Deepak Sikri Date: Sun Jul 8 21:14:45 2012 +0000 stmmac: Fix for nfs hang on multiple reboot commit 8e83989106562326bfd6aaf92174fe138efd026b upstream. It was observed that during multiple reboots nfs hangs. The status of receive descriptors shows that all the descriptors were in control of CPU, and none were assigned to DMA. Also the DMA status register confirmed that the Rx buffer is unavailable. This patch adds the fix for the same by adding the memory barriers to ascertain that the all instructions before enabling the Rx or Tx DMA are completed which involves the proper setting of the ownership bit in DMA descriptors. Signed-off-by: Deepak Sikri Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 91256cf4a1df58edad1147dd619d5bc50a1748dd Author: Davide Gerhard Date: Mon Jun 25 09:04:47 2012 +0200 ipheth: add support for iPad commit 6de0298ec9c1edaf330b71b57346241ece8f3346 upstream. This adds support for the iPad to the ipheth driver. (product id = 0x129a) Signed-off-by: Davide Gerhard Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit db4fd57ab669f33ca6ef67e9f0c8074906073726 Author: Rafael J. Wysocki Date: Tue May 29 21:21:07 2012 +0200 ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification commit dbe9a2edd17d843d80faf2b99f20a691c1853418 upstream. The comparison between the system sleep state being entered and the lowest system sleep state the given device may wake up from in acpi_pm_device_sleep_state() is reversed, because the specification (ACPI 5.0) says that for wakeup to work: “The sleeping state being entered must be less than or equal to the power state declared in element 1 of the _PRW object.” In other words, the state returned by _PRW is the deepest (lowest-power) system sleep state the device is capable of waking up the system from. Moreover, acpi_pm_device_sleep_state() also should check if the wakeup capability is supported through ACPI, because in principle it may be done via native PCIe PME, for example, in which case _SxW should not be evaluated. Signed-off-by: Rafael J. Wysocki Signed-off-by: Ben Hutchings commit 03d200117c06fde714344f8f4f0301f609959b53 Author: Tyler Hicks Date: Tue Jun 12 11:17:01 2012 -0700 eCryptfs: Properly check for O_RDONLY flag before doing privileged open commit 9fe79d7600497ed8a95c3981cbe5b73ab98222f0 upstream. If the first attempt at opening the lower file read/write fails, eCryptfs will retry using a privileged kthread. However, the privileged retry should not happen if the lower file’s inode is read-only because a read/write open will still be unsuccessful. The check for determining if the open should be retried was intended to be based on the access mode of the lower file’s open flags being O_RDONLY, but the check was incorrectly performed. This would cause the open to be retried by the privileged kthread, resulting in a second failed open of the lower file. This patch corrects the check to determine if the open request should be handled by the privileged kthread. Signed-off-by: Tyler Hicks Reported-by: Dan Carpenter Acked-by: Dan Carpenter Signed-off-by: Ben Hutchings commit b8b1e1ea3fa0b1de9a5eead6bab0c563617b9a2b Author: Tyler Hicks Date: Mon Jun 11 10:21:34 2012 -0700 eCryptfs: Fix lockdep warning in miscdev operations commit 60d65f1f07a7d81d3eb3b91fc13fca80f2fdbb12 upstream. Don’t grab the daemon mutex while holding the message context mutex. Addresses this lockdep warning: ecryptfsd/2141 is trying to acquire lock: (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}, at: [] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs] but task is already holding lock: (&(*daemon)->mux){+.+…}, at: [] ecryptfs_miscdev_read+0x21c/0x470 [ecryptfs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(*daemon)->mux){+.+…}: [] lock_acquire+0x9d/0x220 [] __mutex_lock_common+0x5a/0x4b0 [] mutex_lock_nested+0x44/0x50 [] ecryptfs_send_miscdev+0x97/0x120 [ecryptfs] [] ecryptfs_send_message+0x134/0x1e0 [ecryptfs] [] ecryptfs_generate_key_packet_set+0x2fe/0xa80 [ecryptfs] [] ecryptfs_write_metadata+0x108/0x250 [ecryptfs] [] ecryptfs_create+0x130/0x250 [ecryptfs] [] vfs_create+0xb4/0x120 [] do_last+0x8c5/0xa10 [] path_openat+0xd9/0x460 [] do_filp_open+0x42/0xa0 [] do_sys_open+0xf8/0x1d0 [] sys_open+0x21/0x30 [] system_call_fastpath+0x16/0x1b -> #0 (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}: [] __lock_acquire+0x1bf8/0x1c50 [] lock_acquire+0x9d/0x220 [] __mutex_lock_common+0x5a/0x4b0 [] mutex_lock_nested+0x44/0x50 [] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs] [] vfs_read+0xb3/0x180 [] sys_read+0x4d/0x90 [] system_call_fastpath+0x16/0x1b Signed-off-by: Tyler Hicks Signed-off-by: Ben Hutchings commit 4d5a83a81b462742fca6e592b4841f52dc68ec83 Author: Tyler Hicks Date: Mon Jun 11 09:24:11 2012 -0700 eCryptfs: Gracefully refuse miscdev file ops on inherited/passed files commit 8dc6780587c99286c0d3de747a2946a76989414a upstream. File operations on /dev/ecryptfs would BUG() when the operations were performed by processes other than the process that originally opened the file. This could happen with open files inherited after fork() or file descriptors passed through IPC mechanisms. Rather than calling BUG(), an error code can be safely returned in most situations. In ecryptfs_miscdev_release(), eCryptfs still needs to handle the release even if the last file reference is being held by a process that didn’t originally open the file. ecryptfs_find_daemon_by_euid() will not be successful, so a pointer to the daemon is stored in the file’s private_data. The private_data pointer is initialized when the miscdev file is opened and only used when the file is released. https://launchpad.net/bugs/994247 Signed-off-by: Tyler Hicks Reported-by: Sasha Levin Tested-by: Sasha Levin Signed-off-by: Ben Hutchings commit 391493aea95c02174396051e5ed3296325c131ee Author: Pavel Vasilyev Date: Tue Jun 5 00:02:05 2012 -0400 ACPI sysfs.c strlen fix commit 9f132652d94c96476b0b0a8caf0c10e96ab10fa8 upstream. Current code is ignoring the last character of “enable” and “disable” in comparisons. https://bugzilla.kernel.org/show_bug.cgi?id=33732 Signed-off-by: Len Brown Signed-off-by: Ben Hutchings commit 8e65352dd2b452e53eb5821af449b4750ac8e35d Author: Zhang Rui Date: Mon Feb 20 14:20:06 2012 +0800 ACPI, x86: fix Dell M6600 ACPI reboot regression via DMI commit 76eb9a30db4bc8fd172f9155247264b5f2686d7b upstream. Dell Precision M6600 is known to require PCI reboot, so add it to the reboot blacklist in pci_reboot_dmi_table[]. https://bugzilla.kernel.org/show_bug.cgi?id=42749 cc: [email protected] Signed-off-by: Zhang Rui Signed-off-by: Len Brown Signed-off-by: Ben Hutchings commit 80aa998e8260479a3c6bcaee28a961d7be2ba8dd Author: Feng Tang Date: Mon Jun 4 15:00:06 2012 +0800 ACPI: Add a quirk for “AMILO PRO V2030” to ignore the timer overriding commit b939c2acf1dc42b08407ef5174f2e8d6f43dd5ea upstream. commit f6b54f083cc66cf9b11d2120d8df3c2ad4e0836d upstream. This is the 2nd part of fix for kernel bugzilla 40002: “IRQ 0 assigned to VGA” https://bugzilla.kernel.org/show_bug.cgi?id=40002 The root cause is the buggy FW, whose ACPI tables assign the GSI 16 to 2 irqs 0 and 16(VGA), and the VGA is the right owner of GSI 16. So add a quirk to ignore the irq0 overriding GSI 16 for the FUJITSU SIEMENS AMILO PRO V2030 platform will solve this issue. Reported-and-tested-by: Szymon Kowalczyk Signed-off-by: Feng Tang Signed-off-by: Len Brown Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit d6f34925144b450a5a962ca8c3cbde1a38d60fc0 Author: Feng Tang Date: Mon Jun 4 15:00:05 2012 +0800 ACPI: Remove one board specific WARN when ignoring timer overriding commit 5752cdb805ff89942d99d12118e2844e7db34df8 upstream. commit 7f68b4c2e158019c2ec494b5cfbd9c83b4e5b253 upstream. Current WARN msg is only for the ati_ixp4x0 board, while this function is used by mulitple platforms. So this one board specific warning is not appropriate any more. Signed-off-by: Feng Tang Signed-off-by: Len Brown Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 51d5aa75ce4380a55714282a717fde81cce33c35 Author: Feng Tang Date: Mon Jun 4 15:00:04 2012 +0800 ACPI: Make acpi_skip_timer_override cover all source_irq==0 cases commit ae10ccdc3093486f8c2369d227583f9d79f628e5 upstream. Currently when acpi_skip_timer_override is set, it only cover the (source_irq == 0 && global_irq == 2) cases. While there is also platform which need use this option and its global_irq is not 2. This patch will extend acpi_skip_timer_override to cover all timer overriding cases as long as the source irq is 0. This is the first part of a fix to kernel bug bugzilla 40002: “IRQ 0 assigned to VGA” https://bugzilla.kernel.org/show_bug.cgi?id=40002 Reported-and-tested-by: Szymon Kowalczyk Signed-off-by: Feng Tang Signed-off-by: Len Brown Signed-off-by: Ben Hutchings commit 73a3346556281fd56f39f0a9475249e5039d8807 Author: Eric Dumazet Date: Thu Jun 14 06:42:44 2012 +0000 net: remove skb_orphan_try() commit 62b1a8ab9b3660bb820d8dfe23148ed6cda38574 upstream. Orphaning skb in dev_hard_start_xmit() makes bonding behavior unfriendly for applications sending big UDP bursts : Once packets pass the bonding device and come to real device, they might hit a full qdisc and be dropped. Without orphaning, the sender is automatically throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming sk_sndbuf is not too big) We could try to defer the orphaning adding another test in dev_hard_start_xmit(), but all this seems of little gain, now that BQL tends to make packets more likely to be parked in Qdisc queues instead of NIC TX ring, in cases where performance matters. Reverts commits : fc6055a5ba31 net: Introduce skb_orphan_try() 87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try() and removes SKBTX_DRV_NEEDS_SK_REF flag Reported-and-bisected-by: Jean-Michel Hautbois Signed-off-by: Eric Dumazet Tested-by: Oliver Hartkopp Acked-by: Oliver Hartkopp Signed-off-by: David S. Miller [bwh: Backported to 3.2: - Adjust context - SKBTX_WIFI_STATUS is not defined] Signed-off-by: Ben Hutchings commit dcf42d8ca45ca2009ead5cfae84c1c0de0a0af72 Author: Eric Dumazet Date: Wed Jun 13 09:45:16 2012 +0000 bnx2x: fix panic when TX ring is full commit bc14786a100cc6a81cd060e8031ec481241b418c upstream. There is a off by one error in the minimal number of BD in bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming tx queue. A full size GSO packet, with data included in skb->head really needs (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split() This error triggers if BQL is disabled and heavy TCP transmit traffic occurs. bnx2x_tx_split() definitely can be called, remove a wrong comment. Reported-by: Tomas Hruby Signed-off-by: Eric Dumazet Cc: Eilon Greenstein Cc: Yaniv Rosner Cc: Merav Sicron Cc: Tom Herbert Cc: Robert Evans Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit caac50847fd87dcc587181e3af3bd0aebf49964e Author: Eric Dumazet Date: Tue Jun 12 23:50:04 2012 +0000 bnx2x: fix checksum validation commit d6cb3e41386f20fb0777d0b59a2def82c65d37f7 upstream. bnx2x driver incorrectly sets ip_summed to CHECKSUM_UNNECESSARY on encapsulated segments. TCP stack happily accepts frames with bad checksums, if they are inside a GRE or IPIP encapsulation. Our understanding is that if no IP or L4 csum validation was done by the hardware, we should leave ip_summed as is (CHECKSUM_NONE), since hardware doesn’t provide CHECKSUM_COMPLETE support in its cqe. Then, if IP/L4 checksumming was done by the hardware, set CHECKSUM_UNNECESSARY if no error was flagged. Patch based on findings and analysis from Robert Evans Signed-off-by: Eric Dumazet Cc: Eilon Greenstein Cc: Yaniv Rosner Cc: Merav Sicron Cc: Tom Herbert Cc: Robert Evans Cc: Willem de Bruijn Acked-by: Eilon Greenstein Signed-off-by: David S. Miller [bwh: Backported to 3.2: adjust context, indentation] Signed-off-by: Ben Hutchings commit fc6fc50e4d74eb0f4f4c8a07b33d2a797a29082f Author: Devendra Naga Date: Thu May 31 01:51:20 2012 +0000 r8169: call netif_napi_del at errpaths and at driver unload commit ad1be8d345416a794dea39761a374032aa471a76 upstream. when register_netdev fails, the init’ed NAPIs by netif_napi_add must be deleted with netif_napi_del, and also when driver unloads, it should delete the NAPI before unregistering netdevice using unregister_netdev. Signed-off-by: Devendra Naga Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 1803c9c9746dd5cd524d6499ca24cbb4eacffa37 Author: Nadav Har’El Date: Mon Feb 27 15:07:29 2012 +0200 vhost: don’t forget to schedule() commit d550dda192c1bd039afb774b99485e88b70d7cb8 upstream. This is a tiny, but important, patch to vhost. Vhost’s worker thread only called schedule() when it had no work to do, and it wanted to go to sleep. But if there’s always work to do, e.g., the guest is running a network-intensive program like netperf with small message sizes, schedule() was *never* called. This had several negative implications (on non-preemptive kernels): 1. Passing time was not properly accounted to the “vhost” process (ps and top would wrongly show it using zero CPU time). 2. Sometimes error messages about RCU timeouts would be printed, if the core running the vhost thread didn’t schedule() for a very long time. 3. Worst of all, a vhost thread would “hog” the core. If several vhost threads need to share the same core, typically one would get most of the CPU time (and its associated guest most of the performance), while the others hardly get any work done. The trivial solution is to add if (need_resched()) schedule(); After doing every piece of work. This will not do the heavy schedule() all the time, just when the timer interrupt decided a reschedule is warranted (so need_resched returns true). Thanks to Abel Gordon for this patch. Signed-off-by: Nadav Har’El Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit 51326e07490692aa9049e2952bae36fc5824a63d Author: Andreas Schwab Date: Fri Dec 9 11:35:08 2011 +0000 powerpc: Fix wrong divisor in usecs_to_cputime commit 9f5072d4f63f28d30d343573830ac6c85fc0deff upstream. Commit d57af9b (taskstats: use real microsecond granularity for CPU times) renamed msecs_to_cputime to usecs_to_cputime, but failed to update all numbers on the way. This causes nonsensical cpu idle/iowait values to be displayed in /proc/stat (the only user of usecs_to_cputime so far). This also renames __cputime_msec_factor to __cputime_usec_factor, adapting its value and using it directly in cputime_to_usecs instead of doing two multiplications. Signed-off-by: Andreas Schwab Acked-by: Anton Blanchard Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Ben Hutchings commit 0ad70925ab61c308d669d93f02c7bf1974a5158f Author: Thomas Gleixner Date: Mon Jul 16 12:50:42 2012 -0400 timekeeping: Add missing update call in timekeeping_resume() This is a backport of 3e997130bd2e8c6f5aaa49d6e3161d4d29b43ab0 The leap second rework unearthed another issue of inconsistent data. On timekeeping_resume() the timekeeper data is updated, but nothing calls timekeeping_update(), so now the update code in the timer interrupt sees stale values. This has been the case before those changes, but then the timer interrupt was using stale data as well so this went unnoticed for quite some time. Add the missing update call, so all the data is consistent everywhere. Reported-by: Andreas Schwab Reported-and-tested-by: “Rafael J. Wysocki” Reported-and-tested-by: Martin Steigerwald Cc: LKML Cc: Linux PM list Cc: John Stultz Cc: Ingo Molnar Cc: Peter Zijlstra , Cc: Prarit Bhargava Signed-off-by: Thomas Gleixner Signed-off-by: John Stultz Signed-off-by: Linus Torvalds [John Stultz: Backported to 3.2] Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit 7b9a231293995c1211d254a38d60f21980ed5d07 Author: John Stultz Date: Tue Jul 10 18:43:25 2012 -0400 hrtimer: Update hrtimer base offsets each hrtimer_interrupt commit 5baefd6d84163443215f4a99f6a20f054ef11236 upstream. The update of the hrtimer base offsets on all cpus cannot be made atomically from the timekeeper.lock held and interrupt disabled region as smp function calls are not allowed there. clock_was_set(), which enforces the update on all cpus, is called either from preemptible process context in case of do_settimeofday() or from the softirq context when the offset modification happened in the timer interrupt itself due to a leap second. In both cases there is a race window for an hrtimer interrupt between dropping timekeeper lock, enabling interrupts and clock_was_set() issuing the updates. Any interrupt which arrives in that window will see the new time but operate on stale offsets. So we need to make sure that an hrtimer interrupt always sees a consistent state of time and offsets. ktime_get_update_offsets() allows us to get the current monotonic time and update the per cpu hrtimer base offsets from hrtimer_interrupt() to capture a consistent state of monotonic time and the offsets. The function replaces the existing ktime_get() calls in hrtimer_interrupt(). The overhead of the new function vs. ktime_get() is minimal as it just adds two store operations. This ensures that any changes to realtime or boottime offsets are noticed and stored into the per-cpu hrtimer base structures, prior to any hrtimer expiration and guarantees that timers are not expired early. Signed-off-by: John Stultz Reviewed-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Prarit Bhargava Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner Signed-off-by: Ben Hutchings commit ec5806bcd08281a86e05b8e4eaf2f377bc8e5b24 Author: Thomas Gleixner Date: Tue Jul 10 18:43:24 2012 -0400 timekeeping: Provide hrtimer update function This is a backport of f6c06abfb3972ad4914cef57d8348fcb2932bc3b To finally fix the infamous leap second issue and other race windows caused by functions which change the offsets between the various time bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a function which atomically gets the current monotonic time and updates the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic overhead. The previous patch which provides ktime_t offsets allows us to make this function almost as cheap as ktime_get() which is going to be replaced in hrtimer_interrupt(). Signed-off-by: Thomas Gleixner Reviewed-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Prarit Bhargava Signed-off-by: John Stultz Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner [John Stultz: Backported to 3.2] Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit d6a2a0400e36bcd2892272995a65b4bca029220f Author: Thomas Gleixner Date: Tue Jul 10 18:43:23 2012 -0400 hrtimers: Move lock held region in hrtimer_interrupt() commit 196951e91262fccda81147d2bcf7fdab08668b40 upstream. We need to update the base offsets from this code and we need to do that under base->lock. Move the lock held region around the ktime_get() calls. The ktime_get() calls are going to be replaced with a function which gets the time and the offsets atomically. Signed-off-by: Thomas Gleixner Reviewed-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Prarit Bhargava Signed-off-by: John Stultz Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner Signed-off-by: Ben Hutchings commit a105e023adf852c9847e1fe7f0bc151ce0339914 Author: Thomas Gleixner Date: Tue Jul 10 18:43:21 2012 -0400 timekeeping: Maintain ktime_t based offsets for hrtimers This is a backport of 5b9fe759a678e05be4937ddf03d50e950207c1c0 We need to update the hrtimer clock offsets from the hrtimer interrupt context. To avoid conversions from timespec to ktime_t maintain a ktime_t based representation of those offsets in the timekeeper. This puts the conversion overhead into the code which updates the underlying offsets and provides fast accessible values in the hrtimer interrupt. Signed-off-by: Thomas Gleixner Signed-off-by: John Stultz Reviewed-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Prarit Bhargava Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner [John Stultz: Backported to 3.2] Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit 8a1ba973a23bd94a4ce34ffa543bf0e0d4ec13ff Author: John Stultz Date: Tue Jul 10 18:43:20 2012 -0400 timekeeping: Fix leapsecond triggered load spike issue This is a backport of 4873fa070ae84a4115f0b3c9dfabc224f1bc7c51 The timekeeping code misses an update of the hrtimer subsystem after a leap second happened. Due to that timers based on CLOCK_REALTIME are either expiring a second early or late depending on whether a leap second has been inserted or deleted until an operation is initiated which causes that update. Unless the update happens by some other means this discrepancy between the timekeeping and the hrtimer data stays forever and timers are expired either early or late. The reported immediate workaround - $ data -s “`date`” - is causing a call to clock_was_set() which updates the hrtimer data structures. See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix Add the missing clock_was_set() call to update_wall_time() in case of a leap second event. The actual update is deferred to softirq context as the necessary smp function call cannot be invoked from hard interrupt context. Signed-off-by: John Stultz Reported-by: Jan Engelhardt Reviewed-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Prarit Bhargava Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit 3c910e7e46810c73e21196867b7ec63d58a0a45c Author: John Stultz Date: Tue Jul 10 18:43:19 2012 -0400 hrtimer: Provide clock_was_set_delayed() commit f55a6faa384304c89cfef162768e88374d3312cb upstream. clock_was_set() cannot be called from hard interrupt context because it calls on_each_cpu(). For fixing the widely reported leap seconds issue it is necessary to call it from hard interrupt context, i.e. the timer tick code, which does the timekeeping updates. Provide a new function which denotes it in the hrtimer cpu base structure of the cpu on which it is called and raise the hrtimer softirq. We then execute the clock_was_set() notificiation from softirq context in run_hrtimer_softirq(). The hrtimer softirq is rarely used, so polling the flag there is not a performance issue. [ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get rid of all this ifdeffery ASAP ] Signed-off-by: John Stultz Reported-by: Jan Engelhardt Reviewed-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Prarit Bhargava Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner Signed-off-by: Ben Hutchings commit b19f4db486601885ee4c40665f4e8509ec65b4d5 Author: Thomas Gleixner Date: Sun Nov 13 23:19:49 2011 +0000 time: Move common updates to a function This is a backport of cc06268c6a87db156af2daed6e96a936b955cc82 [John Stultz: While not a bugfix itself, it allows following fixes to backport in a more straightforward manner.] CC: Thomas Gleixner CC: Eric Dumazet CC: Richard Cochran Signed-off-by: Thomas Gleixner Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit 09e66e8d71833a897a82ec18484b388e729b6548 Author: John Stultz Date: Wed May 30 10:54:57 2012 -0700 timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond This is a backport of fad0c66c4bb836d57a5f125ecd38bed653ca863a which resolves a bug the previous commit. Commit 6b43ae8a61 (ntp: Fix leap-second hrtimer livelock) broke the leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC. Adjust wall_to_monotonic when NTP inserted a leapsecond. Reported-by: Richard Cochran Signed-off-by: John Stultz Tested-by: Richard Cochran Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit 76117661ce2f9ecf7f969a4aeafd1c74802deae7 Author: Richard Cochran Date: Thu Apr 26 14:11:32 2012 +0200 ntp: Correct TAI offset during leap second commit dd48d708ff3e917f6d6b6c2b696c3f18c019feed upstream. When repeating a UTC time value during a leap second (when the UTC time should be 23:59:60), the TAI timescale should not stop. The kernel NTP code increments the TAI offset one second too late. This patch fixes the issue by incrementing the offset during the leap second itself. Signed-off-by: Richard Cochran Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit a57ccabee60519dd90051266c00d038055b93878 Author: John Stultz Date: Tue Jul 17 03:05:14 2012 -0400 ntp: Fix leap-second hrtimer livelock This is a backport of 6b43ae8a619d17c4935c3320d2ef9e92bdeed05d This should have been backported when it was commited, but I mistook the problem as requiring the ntp_lock changes that landed in 3.4 in order for it to occur. Unfortunately the same issue can happen (with only one cpu) as follows: do_adjtimex() write_seqlock_irq(&xtime_lock); process_adjtimex_modes() process_adj_status() ntp_start_leap_timer() hrtimer_start() hrtimer_reprogram() tick_program_event() clockevents_program_event() ktime_get() seq = req_seqbegin(xtime_lock); [DEADLOCK] This deadlock will no always occur, as it requires the leap_timer to force a hrtimer_reprogram which only happens if its set and there’s no sooner timer to expire. NOTE: This patch, being faithful to the original commit, introduces a bug (we don’t update wall_to_monotonic), which will be resovled by backporting a following fix. Original commit message below: Since commit 7dffa3c673fbcf835cd7be80bb4aec8ad3f51168 the ntp subsystem has used an hrtimer for triggering the leapsecond adjustment. However, this can cause a potential livelock. Thomas diagnosed this as the following pattern: CPU 0 CPU 1 do_adjtimex() spin_lock_irq(&ntp_lock); process_adjtimex_modes(); timer_interrupt() process_adj_status(); do_timer() ntp_start_leap_timer(); write_lock(&xtime_lock); hrtimer_start(); update_wall_time(); hrtimer_reprogram(); ntp_tick_length() tick_program_event() spin_lock(&ntp_lock); clockevents_program_event() ktime_get() seq = req_seqbegin(xtime_lock); This patch tries to avoid the problem by reverting back to not using an hrtimer to inject leapseconds, and instead we handle the leapsecond processing in the second_overflow() function. The downside to this change is that on systems that support highres timers, the leap second processing will occur on a HZ tick boundary, (ie: ~1-10ms, depending on HZ) after the leap second instead of possibly sooner (~34us in my tests w/ x86_64 lapic). This patch applies on top of tip/timers/core. CC: Sasha Levin CC: Thomas Gleixner Reported-by: Sasha Levin Diagnoised-by: Thomas Gleixner Tested-by: Sasha Levin Cc: Prarit Bhargava Cc: Thomas Gleixner Cc: Linux Kernel Signed-off-by: John Stultz Signed-off-by: Ben Hutchings commit 9f1e3e0f9fae973747e113848f9a9d0a2e1867f9 Author: Mikulas Patocka Date: Fri Jul 20 14:25:07 2012 +0100 dm raid1: set discard_zeroes_data_unsupported commit 7c8d3a42fe1c58a7e8fd3f6a013e7d7b474ff931 upstream. We can’t guarantee that REQ_DISCARD on dm-mirror zeroes the data even if the underlying disks support zero on discard. So this patch sets ti->discard_zeroes_data_unsupported. For example, if the mirror is in the process of resynchronizing, it may happen that kcopyd reads a piece of data, then discard is sent on the same area and then kcopyd writes the piece of data to another leg. Consequently, the data is not zeroed. The flag was made available by commit 983c7db347db8ce2d8453fd1d89b7a4bb6920d56 (dm crypt: always disable discard_zeroes_data). Signed-off-by: Mikulas Patocka Signed-off-by: Alasdair G Kergon Signed-off-by: Ben Hutchings commit 0409d1635121277180269f8294ced70214cbd837 Author: Mikulas Patocka Date: Fri Jul 20 14:25:03 2012 +0100 dm raid1: fix crash with mirror recovery and discard commit 751f188dd5ab95b3f2b5f2f467c38aae5a2877eb upstream. This patch fixes a crash when a discard request is sent during mirror recovery. Firstly, some background. Generally, the following sequence happens during mirror synchronization: - function do_recovery is called - do_recovery calls dm_rh_recovery_prepare - dm_rh_recovery_prepare uses a semaphore to limit the number simultaneously recovered regions (by default the semaphore value is 1, so only one region at a time is recovered) - dm_rh_recovery_prepare calls __rh_recovery_prepare, __rh_recovery_prepare asks the log driver for the next region to recover. Then, it sets the region state to DM_RH_RECOVERING. If there are no pending I/Os on this region, the region is added to quiesced_regions list. If there are pending I/Os, the region is not added to any list. It is added to the quiesced_regions list later (by dm_rh_dec function) when all I/Os finish. - when the region is on quiesced_regions list, there are no I/Os in flight on this region. The region is popped from the list in dm_rh_recovery_start function. Then, a kcopyd job is started in the recover function. - when the kcopyd job finishes, recovery_complete is called. It calls dm_rh_recovery_end. dm_rh_recovery_end adds the region to recovered_regions or failed_recovered_regions list (depending on whether the copy operation was successful or not). The above mechanism assumes that if the region is in DM_RH_RECOVERING state, no new I/Os are started on this region. When I/O is started, dm_rh_inc_pending is called, which increases reg->pending count. When I/O is finished, dm_rh_dec is called. It decreases reg->pending count. If the count is zero and the region was in DM_RH_RECOVERING state, dm_rh_dec adds it to the quiesced_regions list. Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is in DM_RH_RECOVERING state, it could be added to quiesced_regions list multiple times or it could be added to this list when kcopyd is copying data (it is assumed that the region is not on any list while kcopyd does its jobs). This results in memory corruption and crash. There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests do not belong to any region, so they are always added to the sync list in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH requests. These bypasses avoid the crash possibility described above. These bypasses were improperly implemented for REQ_DISCARD when the mirror target gained discard support in commit 5fc2ffeabb9ee0fc0e71ff16b49f34f0ed3d05b4 (dm raid1: support discard). In do_writes, REQ_DISCARD requests is always added to the sync queue and immediately dispatched (even if the region is in DM_RH_RECOVERING). However, dm_rh_inc and dm_rh_dec is called for REQ_DISCARD resusts. So it violates the rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list corruption described above. This patch changes it so that REQ_DISCARD requests follow the same path as REQ_FLUSH. This avoids the crash. Reference: https://bugzilla.redhat.com/837607 Signed-off-by: Mikulas Patocka Signed-off-by: Alasdair G Kergon Signed-off-by: Ben Hutchings commit 4060625241ce9236e16174b45fbb452bb5d846d1 Author: Boaz Harrosh Date: Fri Jun 8 02:02:30 2012 +0300 pnfs-obj: Fix __r4w_get_page when offset is beyond i_size commit c999ff68029ebd0f56ccae75444f640f6d5a27d2 upstream. It is very common for the end of the file to be unaligned on stripe size. But since we know it’s beyond file’s end then the XOR should be preformed with all zeros. Old code used to just read zeros out of the OSD devices, which is a great waist. But what scares me more about this situation is that, we now have pages attached to the file’s mapping that are beyond i_size. I don’t like the kind of bugs this calls for. Fix both birds, by returning a global zero_page, if offset is beyond i_size. TODO: Change the API to ->__r4w_get_page() so a NULL can be returned without being considered as error, since XOR API treats NULL entries as zero_pages. [Bug since 3.2. Should apply the same way to all Kernels since] Signed-off-by: Boaz Harrosh [bwh: Backported to 3.2: adjust for lack of wdata->header] Signed-off-by: Ben Hutchings commit 0ab51e8bbf2466f4426142773be0095ad0137848 Author: Boaz Harrosh Date: Fri Jun 8 05:29:40 2012 +0300 pnfs-obj: don’t leak objio_state if ore_write/read fails commit 9909d45a8557455ca5f8ee7af0f253debc851f1a upstream. [Bug since 3.2 Kernel] Signed-off-by: Boaz Harrosh Signed-off-by: Ben Hutchings commit 3e17c16b921d72d121393b618bfe92992c8f3d1c Author: Boaz Harrosh Date: Fri Jun 8 04:30:40 2012 +0300 ore: Remove support of partial IO request (NFS crash) commit 62b62ad873f2accad9222a4d7ffbe1e93f6714c1 upstream. Do to OOM situations the ore might fail to allocate all resources needed for IO of the full request. If some progress was possible it would proceed with a partial/short request, for the sake of forward progress. Since this crashes NFS-core and exofs is just fine without it just remove this contraption, and fail. TODO: Support real forward progress with some reserved allocations of resources, such as mem pools and/or bio_sets [Bug since 3.2 Kernel] CC: Benny Halevy Signed-off-by: Boaz Harrosh Signed-off-by: Ben Hutchings commit c7003a9e7062f9b5233c4a80c8eec19eae30d171 Author: Boaz Harrosh Date: Fri Jun 8 01:19:07 2012 +0300 ore: Fix NFS crash by supporting any unaligned RAID IO commit 9ff19309a9623f2963ac5a136782ea4d8b5d67fb upstream. In RAID_5/6 We used to not permit an IO that it’s end byte is not stripe_size aligned and spans more than one stripe. .i.e the caller must check if after submission the actual transferred bytes is shorter, and would need to resubmit a new IO with the remainder. Exofs supports this, and NFS was supposed to support this as well with it’s short write mechanism. But late testing has exposed a CRASH when this is used with none-RPC layout-drivers. The change at NFS is deep and risky, in it’s place the fix at ORE to lift the limitation is actually clean and simple. So here it is below. The principal here is that in the case of unaligned IO on both ends, beginning and end, we will send two read requests one like old code, before the calculation of the first stripe, and also a new site, before the calculation of the last stripe. If any “boundary” is aligned or the complete IO is within a single stripe. we do a single read like before. The code is clean and simple by splitting the old _read_4_write into 3 even parts: 1._read_4_write_first_stripe 2. _read_4_write_last_stripe 3. _read_4_write_execute And calling 1+3 at the same place as before. 2+3 before last stripe, and in the case of all in a single stripe then 1+2+3 is preformed additively. Why did I not think of it before. Well I had a strike of genius because I have stared at this code for 2 years, and did not find this simple solution, til today. Not that I did not try. This solution is much better for NFS than the previous supposedly solution because the short write was dealt with out-of-band after IO_done, which would cause for a seeky IO pattern where as in here we execute in order. At both solutions we do 2 separate reads, only here we do it within a single IO request. (And actually combine two writes into a single submission) NFS/exofs code need not change since the ORE API communicates the new shorter length on return, what will happen is that this case would not occur anymore. hurray!! [Stable this is an NFS bug since 3.2 Kernel should apply cleanly] Signed-off-by: Boaz Harrosh Signed-off-by: Ben Hutchings commit 10f26f99904373a9cc5e7913ab581991cd4c5940 Author: Artem Bityutskiy Date: Sat Jul 14 14:33:09 2012 +0300 UBIFS: fix a bug in empty space fix-up commit c6727932cfdb13501108b16c38463c09d5ec7a74 upstream. UBIFS has a feature called “empty space fix-up” which is a quirk to work-around limitations of dumb flasher programs. Namely, of those flashers that are unable to skip NAND pages full of 0xFFs while flashing, resulting in empty space at the end of half-filled eraseblocks to be unusable for UBIFS. This feature is relatively new (introduced in v3.0). The fix-up routine (fixup_free_space()) is executed only once at the very first mount if the superblock has the ‘space_fixup’ flag set (can be done with -F option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and writes it back to the same LEB. The routine assumes the image is pristine and does not have anything in the journal. There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly. All but one LEB of the log of a pristine file-system are empty. And one contains just a commit start node. And 'fixup_free_space()' just unmapped this LEB, which resulted in wiping the commit start node. As a result, some users were unable to mount the file-system next time with the following symptom: UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0 The root-cause of this bug was that 'fixup_free_space()' wrongly assumed that the beginning of empty space in the log head (c->lhead_offs) was known on mount. However, it is not the case - it was always 0. UBIFS does not store in it the master node and finds out by scanning the log on every mount. The fix is simple - just pass commit start node size instead of 0 to 'fixup_leb()'. Signed-off-by: Artem Bityutskiy Reported-by: Iwo Mergler Tested-by: Iwo Mergler Reported-by: James Nute Signed-off-by: Ben Hutchings commit dc9391958a838458732358a7f29d78dda7774c65 Author: David Daney Date: Thu Jul 19 09:11:14 2012 +0200 MIPS: Properly align the .data…init_task section. commit 7b1c0d26a8e272787f0f9fcc5f3e8531df3b3409 upstream. Improper alignment can lead to unbootable systems and/or random crashes. [[email protected]: This is a lond standing bug since 6eb10bc9e2deab06630261cd05c4cb1e9a60e980 (kernel.org) rsp. c422a10917f75fd19fa7fe070aaaa23e384dae6f (lmo) [MIPS: Clean up linker script using new linker script macros.] so dates back to 2.6.32.] Signed-off-by: David Daney Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/3881/ Signed-off-by: Ralf Baechle Signed-off-by: Ben Hutchings commit 987e84543bedce6d0e7ae8d986bb2f6e2cc31e53 Author: NeilBrown Date: Thu Jul 19 15:59:18 2012 +1000 md/raid1: close some possible races on write errors during resync commit 58e94ae18478c08229626daece2fc108a4a23261 upstream. commit 4367af556133723d0f443e14ca8170d9447317cb md/raid1: clear bad-block record when write succeeds. Added a ‘reschedule_retry’ call possibility at the end of end_sync_write, but didn’t add matching code at the end of sync_request_write. So if the writes complete very quickly, or scheduling makes it seem that way, then we can miss rescheduling the request and the resync could hang. Also commit 73d5c38a9536142e062c35997b044e89166e063b md: avoid races when stopping resync. Fix a race condition in this same code in end_sync_write but didn’t make the change in sync_request_write. This patch updates sync_request_write to fix both of those. Patch is suitable for 3.1 and later kernels. Reported-by: Alexander Lyakas Original-version-by: Alexander Lyakas Signed-off-by: NeilBrown Signed-off-by: Ben Hutchings commit 068bd5de42ebc3a2aa2ef876614d7c7d11c79f6d Author: NeilBrown Date: Thu Jul 19 15:59:18 2012 +1000 md: avoid crash when stopping md array races with closing other open fds. commit a05b7ea03d72f36edb0cec05e8893803335c61a0 upstream. md will refuse to stop an array if any other fd (or mounted fs) is using it. When any fs is unmounted of when the last open fd is closed all pending IO will be flushed (e.g. sync_blockdev call in __blkdev_put) so there will be no pending IO to worry about when the array is stopped. However in order to send the STOP_ARRAY ioctl to stop the array one must first get and open fd on the block device. If some fd is being used to write to the block device and it is closed after mdadm open the block device, but before mdadm issues the STOP_ARRAY ioctl, then there will be no last-close on the md device so __blkdev_put will not call sync_blockdev. If this happens, then IO can still be in-flight while md tears down the array and bad things can happen (use-after-free and subsequent havoc). So in the case where do_md_stop is being called from an open file descriptor, call sync_block after taking the mutex to ensure there will be no new openers. This is needed when setting a read-write device to read-only too. Reported-by: majianpeng Signed-off-by: NeilBrown Signed-off-by: Ben Hutchings commit e23f455279bab00dcd73e8dd26c5f262dd1e3cc2 Author: Aaditya Kumar Date: Tue Jul 17 15:48:07 2012 -0700 mm: fix lost kswapd wakeup in kswapd_stop() commit 1c7e7f6c0703d03af6bcd5ccc11fc15d23e5ecbe upstream. Offlining memory may block forever, waiting for kswapd() to wake up because kswapd() does not check the event kthread->should_stop before sleeping. The proper pattern, from Documentation/memory-barriers.txt, is: — waker — event_indicated = 1; wake_up_process(event_daemon); — sleeper — for (;;) { set_current_state(TASK_UNINTERRUPTIBLE); if (event_indicated) break; schedule(); } set_current_state() may be wrapped by: prepare_to_wait(); In the kswapd() case, event_indicated is kthread->should_stop. === offlining memory (waker) === kswapd_stop() kthread_stop() kthread->should_stop = 1 wake_up_process() wait_for_completion() === kswapd_try_to_sleep (sleeper) === kswapd_try_to_sleep() prepare_to_wait() . . schedule() . . finish_wait() The schedule() needs to be protected by a test of kthread->should_stop, which is wrapped by kthread_should_stop(). Reproducer: Do heavy file I/O in background. Do a memory offline/online in a tight loop Signed-off-by: Aaditya Kumar Acked-by: KOSAKI Motohiro Reviewed-by: Minchan Kim Acked-by: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 61c0f23450473cda9d2d784ea778361095d50bc3 Author: Jeff Layton Date: Fri Jul 6 07:09:42 2012 -0400 cifs: always update the inode cache with the results from a FIND_* commit cd60042cc1392e79410dc8de9e9c1abb38a29e57 upstream. When we get back a FIND_FIRST/NEXT result, we have some info about the dentry that we use to instantiate a new inode. We were ignoring and discarding that info when we had an existing dentry in the cache. Fix this by updating the inode in place when we find an existing dentry and the uniqueid is the same. Reported-and-Tested-by: Andrew Bartlett Reported-by: Bill Robertson Reported-by: Dion Edwards Signed-off-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit 8e1e19fe1940b5b438273e92036964b1230b6766 Author: Jeff Layton Date: Wed Jul 11 09:09:35 2012 -0400 cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap space commit 3ae629d98bd5ed77585a878566f04f310adbc591 upstream. We currently rely on being able to kmap all of the pages in an async read or write request. If you’re on a machine that has CONFIG_HIGHMEM set then that kmap space is limited, sometimes to as low as 512 slots. With 512 slots, we can only support up to a 2M r/wsize, and that’s assuming that we can get our greedy little hands on all of them. There are other users however, so it’s possible we’ll end up stuck with a size that large. Since we can’t handle a rsize or wsize larger than that currently, cap those options at the number of kmap slots we have. We could consider capping it even lower, but we currently default to a max of 1M. Might as well allow those luddites on 32 bit arches enough rope to hang themselves. A more robust fix would be to teach the send and receive routines how to contend with an array of pages so we don’t need to marshal up a kvec array at all. That’s a fairly significant overhaul though, so we’ll need this limit in place until that’s ready. Reported-by: Jian Li Signed-off-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit 1edae5d5207b5c11e734a465fd0d8618952f74b4 Author: Roland Dreier Date: Mon Jul 16 17:10:17 2012 -0700 target: Fix range calculation in WRITE SAME emulation when num blocks == 0 commit 1765fe5edcb83f53fc67edeb559fcf4bc82c6460 upstream. When NUMBER OF LOGICAL BLOCKS is 0, WRITE SAME is supposed to write all the blocks from the specified LBA through the end of the device. However, dev->transport->get_blocks(dev) (perhaps confusingly) returns the last valid LBA rather than the number of blocks, so the correct number of blocks to write starting with lba is dev->transport->get_blocks(dev) - lba + 1 (nab: Backport roland’s for-3.6 patch to for-3.5) Signed-off-by: Roland Dreier Signed-off-by: Nicholas Bellinger Signed-off-by: Ben Hutchings commit 6154c5bc1f1c0e02dd41bd96300c63e137bdee51 Author: Roland Dreier Date: Mon Jul 16 15:17:10 2012 -0700 target: Clean up returning errors in PR handling code commit d35212f3ca3bf4fb49d15e37f530c9931e2d2183 upstream. - instead of (PTR_ERR(file) < 0) just use IS_ERR(file) - return -EINVAL instead of EINVAL - all other error returns in target_scsi3_emulate_pr_out() use “goto out” – get rid of the one remaining straight “return.” Signed-off-by: Roland Dreier Signed-off-by: Nicholas Bellinger Signed-off-by: Ben Hutchings commit 9729de799b03fd6141e1b248f063575a531f8f00 Author: Anders Kaseorg Date: Sun Jul 15 17:14:25 2012 -0400 fifo: Do not restart open() if it already found a partner commit 05d290d66be6ef77a0b962ebecf01911bd984a78 upstream. If a parent and child process open the two ends of a fifo, and the child immediately exits, the parent may receive a SIGCHLD before its open() returns. In that case, we need to make sure that open() will return successfully after the SIGCHLD handler returns, instead of throwing EINTR or being restarted. Otherwise, the restarted open() would incorrectly wait for a second partner on the other end. The following test demonstrates the EINTR that was wrongly thrown from the parent’s open(). Change .sa_flags = 0 to .sa_flags = SA_RESTART to see a deadlock instead, in which the restarted open() waits for a second reader that will never come. (On my systems, this happens pretty reliably within about 5 to 500 iterations. Others report that it manages to loop ~forever sometimes; YMMV.) #include #include #include #include #include #include #include #include #define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0) void handler(int signum) {} int main() { struct sigaction act = {.sa_handler = handler, .sa_flags = 0}; CHECK(sigaction(SIGCHLD, &act, NULL)); CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0)); for (;;) { int fd; pid_t pid; putc(‘.’, stderr); CHECK(pid = fork()); if (pid == 0) { CHECK(fd = open("fifo", O_RDONLY)); _exit(0); } CHECK(fd = open("fifo", O_WRONLY)); CHECK(close(fd)); CHECK(waitpid(pid, NULL, 0)); } } This is what I suspect was causing the Git test suite to fail in t9010-svn-fe.sh: http://bugs.debian.org/678852 Signed-off-by: Anders Kaseorg Reviewed-by: Jonathan Nieder Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 60d448613a4cf0b8e70c0d16abc6a6baf31ab9b5 Author: Mark Rustad Date: Fri Jul 13 18:18:04 2012 -0700 tcm_fc: Fix crash seen with aborts and large reads commit 3cc5d2a6b9a2fd1bf024aa5e52dd22961eecaf13 upstream. This patch fixes a crash seen when large reads have their exchange aborted by either timing out or being reset. Because the exchange abort results in the seq pointer being set to NULL, because the sequence is no longer valid, it must not be dereferenced. This patch changes the function ft_get_task_tag to return ~0 if it is unable to get the tag for this reason. Because the get_task_tag interface provides no means of returning an error, this seems like the best way to fix this issue at the moment. Signed-off-by: Mark Rustad Signed-off-by: Nicholas Bellinger Signed-off-by: Ben Hutchings commit df4372866de83261b79bf7af24c058649ed0f2eb Author: Tushar Dave Date: Thu Jul 12 08:56:56 2012 +0000 e1000e: Correct link check logic for 82571 serdes commit d0efa8f23a644f7cb7d1f8e78dd9a223efa412a3 upstream. SYNCH bit and IV bit of RXCW register are sticky. Before examining these bits, RXCW should be read twice to filter out one-time false events and have correct values for these bits. Incorrect values of these bits in link check logic can cause weird link stability issues if auto-negotiation fails. Reported-by: Dean Nelson Signed-off-by: Tushar Dave Reviewed-by: Bruce Allan Tested-by: Jeff Pieper Signed-off-by: Jeff Kirsher Signed-off-by: Ben Hutchings commit 154f4399c91b93519cd4a0ba0eceb21ced26c241 Author: Emmanuel Grumbach Date: Wed Jul 4 13:59:08 2012 +0200 iwlegacy: don’t mess up the SCD when removing a key commit b48d96652626b315229b1b82c6270eead6a77a6d upstream. When we remove a key, we put a key index which was supposed to tell the fw that we are actually removing the key. But instead the fw took that index as a valid index and messed up the SRAM of the device. This memory corruption on the device mangled the data of the SCD. The impact on the user is that SCD queue 2 got stuck after having removed keys. Reported-by: Paul Bolle Signed-off-by: Emmanuel Grumbach Signed-off-by: Stanislaw Gruszka Signed-off-by: John W. Linville [bwh: Backported to 3.2: adjust filename, context and variable name] Signed-off-by: Ben Hutchings commit c9d907ded47a70b9cd66a39aa09dbce24c363ffa Author: Stanislaw Gruszka Date: Wed Jul 4 13:20:20 2012 +0200 iwlegacy: always monitor for stuck queues commit c2ca7d92ed4bbd779516beb6eb226e19f7f7ab0f upstream. This is iwlegacy version of: commit 342bbf3fee2fa9a18147e74b2e3c4229a4564912 Author: Johannes Berg Date: Sun Mar 4 08:50:46 2012 -0800 iwlwifi: always monitor for stuck queues If we only monitor while associated, the following can happen: - we’re associated, and the queue stuck check runs, setting the queue “touch” time to X - we disassociate, stopping the monitoring, which leaves the time set to X - almost 2s later, we associate, and enqueue a frame - before the frame is transmitted, we monitor for stuck queues, and find the time set to X, although it is now later than X + 2000ms, so we decide that the queue is stuck and erroneously restart the device Signed-off-by: Stanislaw Gruszka Signed-off-by: John W. Linville [bwh: Backported to 3.2: adjust filename, function and variable names] Signed-off-by: Ben Hutchings commit 042083036e9d93fe1984e93b5e852c176e7fa0b5 Author: Stanislaw Gruszka Date: Wed Jul 4 13:10:02 2012 +0200 rt2x00usb: fix indexes ordering on RX queue kick commit efd821182cec8c92babef6e00a95066d3252fda4 upstream. On rt2x00_dmastart() we increase index specified by Q_INDEX and on rt2x00_dmadone() we increase index specified by Q_INDEX_DONE. So entries between Q_INDEX_DONE and Q_INDEX are those we currently process in the hardware. Entries between Q_INDEX and Q_INDEX_DONE are those we can submit to the hardware. According to that fix rt2x00usb_kick_queue(), as we need to submit RX entries that are not processed by the hardware. It worked before only for empty queue, otherwise was broken. Note that for TX queues indexes ordering are ok. We need to kick entries that have filled skb, but was not submitted to the hardware, i.e. started from Q_INDEX_DONE and have ENTRY_DATA_PENDING bit set. From practical standpoint this fixes RX queue stall, usually reproducible in AP mode, like for example reported here: https://bugzilla.redhat.com/show_bug.cgi?id=828824 Reported-and-tested-by: Franco Miceli Reported-and-tested-by: Tom Horsley Signed-off-by: Stanislaw Gruszka Signed-off-by: John W. Linville Signed-off-by: Ben Hutchings commit e0dc11cd5466e1a433dde19ed0222dc08d9b8338 Author: Cloud Ren Date: Tue Jul 3 16:51:48 2012 +0000 atl1c: fix issue of transmit queue 0 timed out commit b94e52f62683dc0b00c6d1b58b80929a078c0fd5 upstream. some people report atl1c could cause system hang with following kernel trace info: --------------------------------------- WARNING: at…/net/sched/sch_generic.c:258 dev_watchdog+0x1db/0x1d0() … NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out … --------------------------------------- This is caused by netif_stop_queue calling when cable Link is down. So remove netif_stop_queue, because link_watch will take it over. Signed-off-by: xiong Signed-off-by: Cloud Ren Signed-off-by: David S. Miller [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings commit 023f9dff8c3f00dffb38f5185493df46d251c9f8 Author: Takashi Iwai Date: Mon Jun 25 15:07:17 2012 +0200 intel_ips: blacklist HP ProBook laptops commit 88ca518b0bb4161e5f20f8a1d9cc477cae294e54 upstream. intel_ips driver spews the warning message “ME failed to update for more than 1s, likely hung” at each second endlessly on HP ProBook laptops with IronLake. As this has never worked, better to blacklist the driver for now. Signed-off-by: Takashi Iwai Signed-off-by: Matthew Garrett Signed-off-by: Ben Hutchings commit 1885f653e200137b46671a61ca321ab2356821b7 Author: Michal Kazior Date: Fri Jun 8 10:55:44 2012 +0200 cfg80211: check iface combinations only when iface is running commit f8cdddb8d61d16a156229f0910f7ecfc7a82c003 upstream. Don’t validate interface combinations on a stopped interface. Otherwise we might end up being able to create a new interface with a certain type, but won’t be able to change an existing interface into that type. This also skips some other functions when interface is stopped and changing interface type. Signed-off-by: Michal Kazior Signed-off-by: Johannes Berg Signed-off-by: Ben Hutchings commit 6ed6791a1697afcb1615b4252d0c304a743b5f4d Author: Bojan Smojver Date: Sun Apr 29 22:42:06 2012 +0200 PM / Hibernate: Hibernate/thaw fixes/improvements commit 5a21d489fd9541a4a66b9a500659abaca1b19a51 upstream. 1. Do not allocate memory for buffers from emergency pools, unless absolutely required. Do not warn about and do not retry non-essential failed allocations. 2. Do not check the amount of free pages left on every single page write, but wait until one map is completely populated and then check. 3. Set maximum number of pages for read buffering consistently, instead of inadvertently depending on the size of the sector type. 4. Fix copyright line, which I missed when I submitted the hibernation threading patch. 5. Dispense with bit shifting arithmetic to improve readability. 6. Really recalculate the number of pages required to be free after all allocations have been done. 7. Fix calculation of pages required for read buffering. Only count in pages that do not belong to high memory. Signed-off-by: Bojan Smojver Signed-off-by: Rafael J. Wysocki Signed-off-by: Ben Hutchings commit bce3ff4945779822c2d778ac29011d5eecc7330a Author: Samuel Ortiz Date: Thu May 10 19:45:51 2012 +0200 NFC: Export nfc.h to userland commit dbd4fcaf8d664fab4163b1f8682e41ad8bff3444 upstream. The netlink commands and attributes, along with the socket structure definitions need to be exported. Signed-off-by: Samuel Ortiz Signed-off-by: John W. Linville Signed-off-by: Ben Hutchings commit 8f2c5a741019b7ed10643040fe72e6152ee16ed6 Author: Dave Jones Date: Fri Jul 13 13:35:36 2012 -0400 Remove easily user-triggerable BUG from generic_setlease commit 8d657eb3b43861064d36241e88d9d61c709f33f0 upstream. This can be trivially triggered from userspace by passing in something unexpected. kernel BUG at fs/locks.c:1468! invalid opcode: 0000 [#1] SMP RIP: 0010:generic_setlease+0xc2/0x100 Call Trace: __vfs_setlease+0x35/0x40 fcntl_setlease+0x76/0x150 sys_fcntl+0x1c6/0x810 system_call_fastpath+0x1a/0x1f Signed-off-by: Dave Jones Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 631a86fc5ceb30586f87e5fb9cb7827c8c10f570 Author: Jeff Moyer Date: Thu Jul 12 09:43:14 2012 -0400 block: fix infinite loop in __getblk_slow commit 91f68c89d8f35fe98ea04159b9a3b42d0149478f upstream. Commit 080399aaaf35 (“block: don’t mark buffers beyond end of disk as mapped”) exposed a bug in __getblk_slow that causes mount to hang as it loops infinitely waiting for a buffer that lies beyond the end of the disk to become uptodate. The problem was initially reported by Torsten Hilbrich here: https://lkml.org/lkml/2012/6/18/54 and also reported independently here: http://www.sysresccd.org/forums/viewtopic.php?f=13&t=4511 and then Richard W.M. Jones and Marcos Mello noted a few separate bugzillas also associated with the same issue. This patch has been confirmed to fix: https://bugzilla.redhat.com/show_bug.cgi?id=835019 The main problem is here, in __getblk_slow: for (;;) { struct buffer_head * bh; int ret; bh = __find_get_block(bdev, block, size); if (bh) return bh; ret = grow_buffers(bdev, block, size); if (ret < 0) return NULL; if (ret == 0) free_more_memory(); } __find_get_block does not find the block, since it will not be marked as mapped, and so grow_buffers is called to fill in the buffers for the associated page. I believe the for (;;) loop is there primarily to retry in the case of memory pressure keeping grow_buffers from succeeding. However, we also continue to loop for other cases, like the block lying beond the end of the disk. So, the fix I came up with is to only loop when grow_buffers fails due to memory allocation issues (return value of 0). The attached patch was tested by myself, Torsten, and Rich, and was found to resolve the problem in call cases. Signed-off-by: Jeff Moyer Reported-and-Tested-by: Torsten Hilbrich Tested-by: Richard W.M. Jones Reviewed-by: Josh Boyer [ Jens is on vacation, taking this directly - Linus ] – Stable Notes: this patch requires backport to 3.0, 3.2 and 3.3. Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 3bbc9e1917b93542c0da222e06496f37d1e82182 Author: Todd Poynor Date: Fri Jul 13 15:30:48 2012 +0900 ARM: SAMSUNG: fix race in s3c_adc_start for ADC commit 8265981bb439f3ecc5356fb877a6c2a6636ac88a upstream. Checking for adc->ts_pend already claimed should be done with the lock held. Signed-off-by: Todd Poynor Acked-by: Ben Dooks Signed-off-by: Kukjin Kim Signed-off-by: Ben Hutchings commit eb42f93d7f329d7ff24d99975ce144f01b1990fb Author: Jean Delvare Date: Thu Jul 12 22:47:37 2012 +0200 hwmon: (it87) Preserve configuration register bits on init commit 41002f8dd5938d5ad1d008ce5bfdbfe47fa7b4e8 upstream. We were accidentally losing one bit in the configuration register on device initialization. It was reported to freeze one specific system right away. Properly preserve all bits we don’t explicitly want to change in order to prevent that. Reported-by: Stevie Trujillo Signed-off-by: Jean Delvare Reviewed-by: Guenter Roeck Signed-off-by: Ben Hutchings commit 51954298d48d86e5b7b11c9aeab316a407a039e9 Author: Thomas Renninger Date: Thu Jul 12 12:24:33 2012 +0200 cpufreq / ACPI: Fix not loading acpi-cpufreq driver regression commit c4686c71a9183f76e3ef59098da5c098748672f6 upstream. Commit d640113fe80e45ebd4a5b420b introduced a regression on SMP systems where the processor core with ACPI id zero is disabled (typically should be the case because of hyperthreading). The regression got spread through stable kernels. On 3.0.X it got introduced via 3.0.18. Such platforms may be rare, but do exist. Look out for a disabled processor with acpi_id 0 in dmesg: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x10] disabled) This problem has been observed on a: HP Proliant BL280c G6 blade This patch restricts the introduced workaround to platforms with nr_cpu_ids <= 1. Signed-off-by: Thomas Renninger Signed-off-by: Rafael J. Wysocki Signed-off-by: Ben Hutchings commit 3501ec357a73226d2b3c1bd204d97926f61b3eae Author: Bob Liu Date: Wed Jul 11 14:02:35 2012 -0700 fs: ramfs: file-nommu: add SetPageUptodate() commit fea9f718b3d68147f162ed2d870183ce5e0ad8d8 upstream. There is a bug in the below scenario for !CONFIG_MMU: 1. create a new file 2. mmap the file and write to it 3. read the file can’t get the correct value Because sys_read() -> generic_file_aio_read() -> simple_readpage() -> clear_page() which causes the page to be zeroed. Add SetPageUptodate() to ramfs_nommu_expand_for_mapping() so that generic_file_aio_read() do not call simple_readpage(). Signed-off-by: Bob Liu Cc: Hugh Dickins Cc: David Howells Cc: Geert Uytterhoeven Cc: Greg Ungerer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit d1d5b31c0f73da39cc084af6240601543763f99d Author: Benoît Thébaudeau Date: Wed Jul 11 14:02:32 2012 -0700 drivers/rtc/rtc-mxc.c: fix irq enabled interrupts warning commit b59f6d1febd6cbe9fae4589bf72da0ed32bc69e0 upstream. Fixes WARNING: at irq/handle.c:146 handle_irq_event_percpu+0x19c/0x1b8() irq 25 handler mxc_rtc_interrupt+0x0/0xac enabled interrupts Modules linked in: (unwind_backtrace+0x0/0xf0) from (warn_slowpath_common+0x4c/0x64) (warn_slowpath_common+0x4c/0x64) from (warn_slowpath_fmt+0x30/0x40) (warn_slowpath_fmt+0x30/0x40) from (handle_irq_event_percpu+0x19c/0x1b8) (handle_irq_event_percpu+0x19c/0x1b8) from (handle_irq_event+0x28/0x38) (handle_irq_event+0x28/0x38) from (handle_level_irq+0x80/0xc4) (handle_level_irq+0x80/0xc4) from (generic_handle_irq+0x24/0x38) (generic_handle_irq+0x24/0x38) from (handle_IRQ+0x30/0x84) (handle_IRQ+0x30/0x84) from (avic_handle_irq+0x2c/0x4c) (avic_handle_irq+0x2c/0x4c) from (__irq_svc+0x40/0x60) Exception stack(0xc050bf60 to 0xc050bfa8) bf60: 00000001 00000000 003c4208 c0018e20 c050a000 c050a000 c054a4c8 c050a000 bf80: c05157a8 4117b363 80503bb4 00000000 01000000 c050bfa8 c0018e2c c000e808 bfa0: 60000013 ffffffff (__irq_svc+0x40/0x60) from (default_idle+0x1c/0x30) (default_idle+0x1c/0x30) from (cpu_idle+0x68/0xa8) (cpu_idle+0x68/0xa8) from (start_kernel+0x22c/0x26c) Signed-off-by: Benoît Thébaudeau Cc: Alessandro Zummo Cc: Sascha Hauer Acked-by: Uwe Kleine-König Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit e13d6560aa9799d9720afea1480093166f6e767a Author: David Rientjes Date: Wed Jul 11 14:02:13 2012 -0700 mm, thp: abort compaction if migration page cannot be charged to memcg commit 4bf2bba3750f10aa9e62e6949bc7e8329990f01b upstream. If page migration cannot charge the temporary page to the memcg, migrate_pages() will return -ENOMEM. This isn’t considered in memory compaction however, and the loop continues to iterate over all pageblocks trying to isolate and migrate pages. If a small number of very large memcgs happen to be oom, however, these attempts will mostly be futile leading to an enormous amout of cpu consumption due to the page migration failures. This patch will short circuit and fail memory compaction if migrate_pages() returns -ENOMEM. COMPACT_PARTIAL is returned in case some migrations were successful so that the page allocator will retry. Signed-off-by: David Rientjes Acked-by: Mel Gorman Cc: Minchan Kim Cc: Kamezawa Hiroyuki Cc: Rik van Riel Cc: Andrea Arcangeli Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 5eb695db655214b28a44040db35c43178541bcc8 Author: Luis Henriques Date: Wed Jul 11 14:02:10 2012 -0700 ocfs2: fix NULL pointer dereference in __ocfs2_change_file_space() commit a4e08d001f2e50bb8b3c4eebadcf08e5535f02ee upstream. As ocfs2_fallocate() will invoke __ocfs2_change_file_space() with a NULL as the first parameter (file), it may trigger a NULL pointer dereferrence due to a missing check. Addresses http://bugs.launchpad.net/bugs/1006012 Signed-off-by: Luis Henriques Reported-by: Bret Towe Tested-by: Bret Towe Cc: Sunil Mushran Acked-by: Joel Becker Acked-by: Mark Fasheh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 0906d248e8fddbd583fe4dd15631f4c2be2eb3b4 Author: Jiang Liu Date: Wed Jul 11 14:01:52 2012 -0700 memory hotplug: fix invalid memory access caused by stale kswapd pointer commit d8adde17e5f858427504725218c56aef90e90fc7 upstream. kswapd_stop() is called to destroy the kswapd work thread when all memory of a NUMA node has been offlined. But kswapd_stop() only terminates the work thread without resetting NODE_DATA(nid)->kswapd to NULL. The stale pointer will prevent kswapd_run() from creating a new work thread when adding memory to the memory-less NUMA node again. Eventually the stale pointer may cause invalid memory access. An example stack dump as below. It’s reproduced with 2.6.32, but latest kernel has the same issue. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] exit_creds+0x12/0x78 PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/memory/memory391/state CPU 11 Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq microcode fuse loop dm_mod tpm_tis rtc_cmos i2c_i801 rtc_core tpm serio_raw pcspkr sg tpm_bios igb i2c_core iTCO_wdt rtc_lib mptctl iTCO_vendor_support button dca bnx2 usbhid hid uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif edd ext3 mbcache jbd fan ide_pci_generic ide_core ata_generic ata_piix libata thermal processor thermal_sys hwmon mptsas mptscsih mptbase scsi_transport_sas scsi_mod Pid: 7949, comm: sh Not tainted 2.6.32.12-qiuxishi-5-default #92 Tecal RH2285 RIP: 0010:exit_creds+0x12/0x78 RSP: 0018:ffff8806044f1d78 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff880604f22140 RCX: 0000000000019502 RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000 RBP: ffff880604f22150 R08: 0000000000000000 R09: ffffffff81a4dc10 R10: 00000000000032a0 R11: ffff880006202500 R12: 0000000000000000 R13: 0000000000c40000 R14: 0000000000008000 R15: 0000000000000001 FS: 00007fbc03d066f0(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 000000060f029000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process sh (pid: 7949, threadinfo ffff8806044f0000, task ffff880603d7c600) Stack: ffff880604f22140 ffffffff8103aac5 ffff880604f22140 ffffffff8104d21e ffff880006202500 0000000000008000 0000000000c38000 ffffffff810bd5b1 0000000000000000 ffff880603d7c600 00000000ffffdd29 0000000000000003 Call Trace: __put_task_struct+0x5d/0x97 kthread_stop+0x50/0x58 offline_pages+0x324/0x3da memory_block_change_state+0x179/0x1db store_mem_state+0x9e/0xbb sysfs_write_file+0xd0/0x107 vfs_write+0xad/0x169 sys_write+0x45/0x6e system_call_fastpath+0x16/0x1b Code: ff 4d 00 0f 94 c0 84 c0 74 08 48 89 ef e8 1f fd ff ff 5b 5d 31 c0 41 5c c3 53 48 8b 87 20 06 00 00 48 89 fb 48 8b bf 18 06 00 00 <8b> 00 48 c7 83 18 06 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0 RIP exit_creds+0x12/0x78 RSP CR2: 0000000000000000 [[email protected]: add pglist_data.kswapd locking comments] Signed-off-by: Xishi Qiu Signed-off-by: Jiang Liu Acked-by: KAMEZAWA Hiroyuki Acked-by: KOSAKI Motohiro Acked-by: Mel Gorman Acked-by: David Rientjes Reviewed-by: Minchan Kim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 88153b5ca6da2602bac3f4185d9732f1547b8480 Author: Alan Stern Date: Mon Jul 9 11:09:21 2012 -0400 PCI: EHCI: fix crash during suspend on ASUS computers commit dbf0e4c7257f8d684ec1a3c919853464293de66e upstream. Quite a few ASUS computers experience a nasty problem, related to the EHCI controllers, when going into system suspend. It was observed that the problem didn’t occur if the controllers were not put into the D3 power state before starting the suspend, and commit 151b61284776be2d6f02d48c23c3625678960b97 (USB: EHCI: fix crash during suspend on ASUS computers) was created to do this. It turned out this approach messed up other computers that didn’t have the problem – it prevented USB wakeup from working. Consequently commit c2fb8a3fa25513de8fedb38509b1f15a5bbee47b (USB: add NO_D3_DURING_SLEEP flag and revert 151b61284776be2) was merged; it reverted the earlier commit and added a whitelist of known good board names. Now we know the actual cause of the problem. Thanks to AceLan Kao for tracking it down. According to him, an engineer at ASUS explained that some of their BIOSes contain a bug that was added in an attempt to work around a problem in early versions of Windows. When the computer goes into S3 suspend, the BIOS tries to verify that the EHCI controllers were first quiesced by the OS. Nothing’s wrong with this, but the BIOS does it by checking that the PCI COMMAND registers contain 0 without checking the controllers’ power state. If the register isn’t 0, the BIOS assumes the controller needs to be quiesced and tries to do so. This involves making various MMIO accesses to the controller, which don’t work very well if the controller is already in D3. The end result is a system hang or memory corruption. Since the value in the PCI COMMAND register doesn’t matter once the controller has been suspended, and since the value will be restored anyway when the controller is resumed, we can work around the BIOS bug simply by setting the register to 0 during system suspend. This patch (as1590) does so and also reverts the second commit mentioned above, which is now unnecessary. In theory we could do this for every PCI device. However to avoid introducing new problems, the patch restricts itself to EHCI host controllers. Finally the affected systems can suspend with USB wakeup working properly. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=37632 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=42728 Based-on-patch-by: AceLan Kao Signed-off-by: Alan Stern Tested-by: Dâniel Fraga Tested-by: Javier Marcet Tested-by: Andrey Rahmatullin Tested-by: Oleksij Rempel Tested-by: Pavel Pisa Acked-by: Bjorn Helgaas Acked-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 33c050f877ba0d95c43a2bc81f3e6870fa8b0a6b Author: NeilBrown Date: Mon Jul 9 11:34:13 2012 +1000 md/raid1: fix use-after-free bug in RAID1 data-check code. commit 2d4f4f3384d4ef4f7c571448e803a1ce721113d5 upstream. This bug has been present ever since data-check was introduce in 2.6.16. However it would only fire if a data-check were done on a degraded array, which was only possible if the array has 3 or more devices. This is certainly possible, but is quite uncommon. Since hot-replace was added in 3.3 it can happen more often as the same condition can arise if not all possible replacements are present. The problem is that as soon as we submit the last read request, the ‘r1_bio’ structure could be freed at any time, so we really should stop looking at it. If the last device is being read from we will stop looking at it. However if the last device is not due to be read from, we will still check the bio pointer in the r1_bio, but the r1_bio might already be free. So use the read_targets counter to make sure we stop looking for bios to submit as soon as we have submitted them all. This fix is suitable for any -stable kernel since 2.6.16. Reported-by: Arnold Schulz Signed-off-by: NeilBrown [bwh: Backported to 3.2: no doubling of conf->raid_disks; we don’t have hot-replace support] Signed-off-by: Ben Hutchings commit cb480c94c8e89014cdb54201a413bf9d36f6cd41 Author: Dan Williams Date: Fri Jun 22 10:52:34 2012 -0700 libsas: fix taskfile corruption in sas_ata_qc_fill_rtf commit 6ef1b512f4e6f936d89aa20be3d97a7ec7c290ac upstream. fill_result_tf() grabs the taskfile flags from the originating qc which sas_ata_qc_fill_rtf() promptly overwrites. The presence of an ata_taskfile in the sata_device makes it tempting to just copy the full contents in sas_ata_qc_fill_rtf(). However, libata really only wants the fis contents and expects the other portions of the taskfile to not be touched by ->qc_fill_rtf. To that end store a fis buffer in the sata_device and use ata_tf_from_fis() like every other ->qc_fill_rtf() implementation. Reported-by: Praveen Murali Tested-by: Praveen Murali Signed-off-by: Dan Williams Signed-off-by: James Bottomley [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings commit 1e1cdddbadc5f3044882127f3931a19f48b1a69b Author: Shinya Kuribayashi Date: Sat Jul 7 13:37:42 2012 +0300 hwspinlock/core: use global ID to register hwspinlocks on multiple devices commit 476a7eeb60e70ddab138e7cb4bc44ef5ac20782e upstream. Commit 300bab9770 (hwspinlock/core: register a bank of hwspinlocks in a single API call, 2011-09-06) introduced 'hwspin_lock_register_single()' to register numerous (a bank of) hwspinlock instances in a single API, 'hwspin_lock_register()'. At which time, 'hwspin_lock_register()' accidentally passes ‘local IDs’ to 'hwspin_lock_register_single()', despite that …_single() requires ‘global IDs’ to register hwspinlocks. We have to convert into global IDs by supplying the missing 'base_id’. Signed-off-by: Shinya Kuribayashi [ohad: fix error path of hwspin_lock_register, too] Signed-off-by: Ohad Ben-Cohen Signed-off-by: Ben Hutchings commit c4c5d62cb86eedb8aa4b87b2b8ae9f2b909b0b5b Author: Santosh Nayak Date: Sat Jun 23 07:59:54 2012 -0300 dvb-core: Release semaphore on error path dvb_register_device() commit 82163edcdfa4eb3d74516cc8e9f38dd3d039b67d upstream. There is a missing "up_write()" here. Semaphore should be released before returning error value. Signed-off-by: Santosh Nayak Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Ben Hutchings commit f01e43b25ec9d9b682ed5204bb9077bafdbb7abc Author: Herton Ronaldo Krzesinski Date: Wed May 16 16:21:52 2012 -0300 mtd: nandsim: don’t open code a do_div helper commit 596fd46268634082314b3af1ded4612e1b7f3f03 upstream. We don’t need to open code the divide function, just use div_u64 that already exists and do the same job. While this is a straightforward clean up, there is more to that, the real motivation for this. While building on a cross compiling environment in armel, using gcc 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5), I was getting the following build error: ERROR: “__aeabi_uldivmod” [drivers/mtd/nand/nandsim.ko] undefined! After investigating with objdump and hand built assembly version generated with the compiler, I narrowed __aeabi_uldivmod as being generated from the divide function. When nandsim.c is built with -fno-inline-functions-called-once, that happens when CONFIG_DEBUG_SECTION_MISMATCH is enabled, the do_div optimization in arch/arm/include/asm/div64.h doesn’t work as expected with the open coded divide function: even if the do_div we are using doesn’t have a constant divisor, the compiler still includes the else parts of the optimized do_div macro, and translates the divisions there to use __aeabi_uldivmod, instead of only calling __do_div_asm -> __do_div64 and optimizing/removing everything else out. So to reproduce, gcc 4.6 plus CONFIG_DEBUG_SECTION_MISMATCH=y and CONFIG_MTD_NAND_NANDSIM=m should do it, building on armel. After this change, the compiler does the intended thing even with -fno-inline-functions-called-once, and optimizes out as expected the constant handling in the optimized do_div on arm. As this also avoids a build issue, I’m marking for Stable, as I think is applicable for this case. Signed-off-by: Herton Ronaldo Krzesinski Acked-by: Nicolas Pitre Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Ben Hutchings commit ec299e27c68a3481017762def2e5f2f73d9ac43c Author: Bjørn Mork Date: Mon Jul 2 10:33:14 2012 +0200 USB: cdc-wdm: fix lockup on error in wdm_read commit b086b6b10d9f182cd8d2f0dcfd7fd11edba93fc9 upstream. Clear the WDM_READ flag on empty reads to avoid running forever in an infinite tight loop, causing lockups: Jul 1 21:58:11 nemi kernel: [ 3658.898647] qmi_wwan 2-1:1.2: Unexpected error -71 Jul 1 21:58:36 nemi kernel: [ 3684.072021] BUG: soft lockup - CPU#0 stuck for 23s! [qmi.pl:12235] Jul 1 21:58:36 nemi kernel: [ 3684.072212] CPU 0 Jul 1 21:58:36 nemi kernel: [ 3684.072355] Jul 1 21:58:36 nemi kernel: [ 3684.072367] Pid: 12235, comm: qmi.pl Tainted: P O 3.5.0-rc2+ #13 LENOVO 2776LEG/2776LEG Jul 1 21:58:36 nemi kernel: [ 3684.072383] RIP: 0010:[] [] spin_unlock_irq+0x8/0xc [cdc_wdm] Jul 1 21:58:36 nemi kernel: [ 3684.072388] RSP: 0018:ffff88022dca1e70 EFLAGS: 00000282 Jul 1 21:58:36 nemi kernel: [ 3684.072393] RAX: ffff88022fc3f650 RBX: ffffffff811c56f7 RCX: 00000001000ce8c1 Jul 1 21:58:36 nemi kernel: [ 3684.072398] RDX: 0000000000000010 RSI: 000000000267d810 RDI: ffff88022fc3f650 Jul 1 21:58:36 nemi kernel: [ 3684.072403] RBP: ffff88022dca1eb0 R08: ffffffffa063578e R09: 0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072407] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000002 Jul 1 21:58:36 nemi kernel: [ 3684.072412] R13: 0000000000000246 R14: ffffffff00000002 R15: ffff8802281d8c88 Jul 1 21:58:36 nemi kernel: [ 3684.072418] FS: 00007f666a260700(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jul 1 21:58:36 nemi kernel: [ 3684.072428] CR2: 000000000270d9d8 CR3: 000000022e865000 CR4: 00000000000007f0 Jul 1 21:58:36 nemi kernel: [ 3684.072433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jul 1 21:58:36 nemi kernel: [ 3684.072444] Process qmi.pl (pid: 12235, threadinfo ffff88022dca0000, task ffff88022ff76380) Jul 1 21:58:36 nemi kernel: [ 3684.072448] Stack: Jul 1 21:58:36 nemi kernel: [ 3684.072458] ffffffffa063592e 0000000100020000 ffff88022fc3f650 ffff88022fc3f6a8 Jul 1 21:58:36 nemi kernel: [ 3684.072466] 0000000000000200 0000000100000000 000000000267d810 0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072475] 0000000000000000 ffff880212cfb6d0 0000000000000200 ffff880212cfb6c0 Jul 1 21:58:36 nemi kernel: [ 3684.072479] Call Trace: Jul 1 21:58:36 nemi kernel: [ 3684.072489] [] ? wdm_read+0x1a0/0x263 [cdc_wdm] Jul 1 21:58:36 nemi kernel: [ 3684.072500] [] ? vfs_read+0xa1/0xfb Jul 1 21:58:36 nemi kernel: [ 3684.072509] [] ? alarm_setitimer+0x35/0x64 Jul 1 21:58:36 nemi kernel: [ 3684.072517] [] ? sys_read+0x45/0x6e Jul 1 21:58:36 nemi kernel: [ 3684.072525] [] ? system_call_fastpath+0x16/0x1b Jul 1 21:58:36 nemi kernel: [ 3684.072557] Code: <66> 66 90 c3 83 ff ed 89 f8 74 16 7f 06 83 ff a1 75 0a c3 83 ff f4 The WDM_READ flag is normally cleared by wdm_int_callback before resubmitting the read urb, and set by wdm_in_callback when this urb returns with data or an error. But a crashing device may cause both a read error and cancelling all urbs. Make sure that the flag is cleared by wdm_read if the buffer is empty. We don’t clear the flag on errors, as there may be pending data in the buffer which should be processed. The flag will instead be cleared on the next wdm_read call. Signed-off-by: Bjørn Mork Acked-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 8d8052f842113d653a0129ed78a1ebdb7dfb62dd Author: Gaosen Zhang Date: Thu Jul 5 21:49:00 2012 +0800 USB: option: Add MEDIATEK product ids commit aacef9c561a693341566a6850c451ce3df68cb9a upstream. Signed-off-by: Gaosen Zhang Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit c3bc4100aadffb067a9e7fa96c3ed03a34e238ff Author: Bjørn Mork Date: Mon Jul 2 19:53:55 2012 +0200 USB: option: add ZTE MF60 commit 8e16e33c168a6efd0c9f7fa9dd4c1e1db9a74553 upstream. Switches into a composite device by ejecting the initial driver CD. The four interfaces are: QCDM, AT, QMI/wwan and mass storage. Let this driver manage the two serial interfaces: T: Bus=02 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 28 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=19d2 ProdID=1402 Rev= 0.00 S: Manufacturer=ZTE,Incorporated S: Product=ZTE WCDMA Technologies MSM S: SerialNumber=xxxxx C:* #Ifs= 4 Cfg#= 1 Atr=c0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Bjørn Mork Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit d9a5888a4162d8e06eb909d31cbe5c779c8b67dc Author: Peter Zijlstra Date: Fri Jun 22 15:52:09 2012 +0200 sched/nohz: Rewrite and fix load-avg computation – again commit 5167e8d5417bf5c322a703d2927daec727ea40dd upstream. Thanks to Charles Wang for spotting the defects in the current code: - If we go idle during the sample window – after sampling, we get a negative bias because we can negate our own sample. - If we wake up during the sample window we get a positive bias because we push the sample to a known active period. So rewrite the entire nohz load-avg muck once again, now adding copious documentation to the code. Reported-and-tested-by: Doug Smythies Reported-and-tested-by: Charles Wang Signed-off-by: Peter Zijlstra Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/1340373782.18025.74.camel@twins [ minor edits ] Signed-off-by: Ingo Molnar [bwh: Backported to 3.2: adjust filenames, context] Signed-off-by: Ben Hutchings commit 43fc6bce37ff39ab5010afadf1ea86e7cc9c2b37 Author: Mark Brown Date: Sat Jun 9 11:07:56 2012 +0800 gpiolib: wm8994: Pay attention to the value set when enabling as output commit 8cd578b6e28693f357867a77598a88ef3deb6b39 upstream. Not paying attention to the value being set is a bad thing because it means that we’ll not set the hardware up to reflect what was requested. Not setting the hardware up to reflect what was requested means that the caller won’t get the results they wanted. Signed-off-by: Mark Brown Signed-off-by: Linus Walleij Signed-off-by: Ben Hutchings commit db46b26cf3a124bd5beb357e48c0e2ea6e2f1725 Author: Stanislaw Ledwon Date: Mon Jun 18 15:20:00 2012 +0200 usb: Add support for root hub port status CAS commit 8bea2bd37df08aaa599aa361a9f8b836ba98e554 upstream. The host controller port status register supports CAS (Cold Attach Status) bit. This bit could be set when USB3.0 device is connected when system is in Sx state. When the system wakes to S0 this port status with CAS bit is reported and this port can’t be used by any device. When CAS bit is set the port should be reset by warm reset. This was not supported by xhci driver. The issue was found when pendrive was connected to suspended platform. The link state of “Compliance Mode” was reported together with CAS bit. This link state was also not supported by xhci and core/hub.c. The CAS bit is defined only for xhci root hub port and it is not supported on regular hubs. The link status is used to force warm reset on port. Make the USB core issue a warm reset when port is in ether the ‘inactive’ or ‘compliance mode’. Change the xHCI driver to report ‘compliance mode’ when the CAS is set. This force warm reset on the root hub port. This patch should be backported to stable kernels as old as 3.2, that contain the commit 10d674a82e553cb8a1f41027bb3c3e309b3f6804 “USB: When hot reset for USB3 fails, try warm reset.” Signed-off-by: Stanislaw Ledwon Signed-off-by: Sarah Sharp Acked-by: Andiry Xu Signed-off-by: Ben Hutchings commit f2d391c109ecc377d55f7865ac8ea26ac8921ab7 Author: Joerg Roedel Date: Thu Jun 21 14:52:40 2012 +0200 iommu/amd: Initialize dma_ops for hotplug and sriov devices commit ac1534a55d1e87d59a21c09c570605933b551480 upstream. When a device is added to the system at runtime the AMD IOMMU driver initializes the necessary data structures to handle translation for it. But it forgets to change the per-device dma_ops to point to the AMD IOMMU driver. So mapping actually never happens and all DMA accesses end in an IO_PAGE_FAULT. Fix this. Reported-by: Stefan Assmann Signed-off-by: Joerg Roedel [bwh: Backported to 3.2: - Adjust context - Use global iommu_pass_through; there is no per-device pass_through] Signed-off-by: Ben Hutchings commit f17963661759dfde573f9d7993cf5aa5a97057c3 Author: Shuah Khan Date: Wed Jun 6 10:50:06 2012 -0600 iommu/amd: Fix missing iommu_shutdown initialization in passthrough mode commit f2f12b6fc032c7b1419fd6db84e2868b5f05a878 upstream. The iommu_shutdown callback is not initialized when the AMD IOMMU driver runs in passthrough mode. Fix that by moving the callback initialization before the check for passthrough mode. Signed-off-by: Shuah Khan Signed-off-by: Joerg Roedel [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings commit 11a5b0b59b9b38e71fa45bd9739a6f4ffa8ce61a Author: Jason Baron Date: Wed Apr 25 16:01:47 2012 -0700 epoll: clear the tfile_check_list on -ELOOP commit 13d518074a952d33d47c428419693f63389547e9 upstream. An epoll_ctl(,EPOLL_CTL_ADD,) operation can return '-ELOOP’ to prevent circular epoll dependencies from being created. However, in that case we do not properly clear the 'tfile_check_list’. Thus, add a call to clear_tfile_check_list() for the -ELOOP case. Signed-off-by: Jason Baron Reported-by: Yurij M. Plotnikov Cc: Nelson Elhage Cc: Davide Libenzi Tested-by: Alexandra N. Kossovsky Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit f45c9a6eec20cd712421c442785e7a4e9215a230 Author: Jan Kara Date: Fri Jun 15 12:52:46 2012 +0200 scsi: Silence unnecessary warnings about ioctl to partition commit 6d9359280753d2955f86d6411047516a9431eb51 upstream. Sometimes, warnings about ioctls to partition happen often enough that they form majority of the warnings in the kernel log and users complain. In some cases warnings are about ioctls such as SG_IO so it’s not good to get rid of the warnings completely as they can ease debugging of userspace problems when ioctl is refused. Since I have seen warnings from lots of commands, including some proprietary userspace applications, I don’t think disallowing the ioctls for processes with CAP_SYS_RAWIO will happen in the near future if ever. So lets just stop warning for processes with CAP_SYS_RAWIO for which ioctl is allowed. CC: Paolo Bonzini CC: James Bottomley CC: [email protected] Acked-by: Paolo Bonzini Signed-off-by: Jan Kara Signed-off-by: Jens Axboe [bwh: Backported to 3.2: use ENOTTY, not ENOIOCTLCMD] Signed-off-by: Ben Hutchings commit 0f3cbc35d2097d2c655789dd4996e7b87bdb5d34 Author: Avi Kivity Date: Sun Apr 22 17:02:11 2012 +0300 KVM: Fix buffer overflow in kvm_set_irq() commit f2ebd422f71cda9c791f76f85d2ca102ae34a1ed upstream. kvm_set_irq() has an internal buffer of three irq routing entries, allowing connecting a GSI to three IRQ chips or on MSI. However setup_routing_entry() does not properly enforce this, allowing three irqchip routes followed by an MSI route to overflow the buffer. Fix by ensuring that an MSI entry is added to an empty list. Signed-off-by: Avi Kivity Signed-off-by: Ben Hutchings commit 1be535a022a9c0b9a55cab2993ee29e05c4ead0b Author: Jason Wang Date: Wed May 2 11:42:15 2012 +0800 macvtap: zerocopy: validate vectors before building skb commit b92946e2919134ebe2a4083e4302236295ea2a73 upstream. There’re several reasons that the vectors need to be validated: - Return error when caller provides vectors whose num is greater than UIO_MAXIOV. - Linearize part of skb when userspace provides vectors grater than MAX_SKB_FRAGS. - Return error when userspace provides vectors whose total length may exceed - MAX_SKB_FRAGS * PAGE_SIZE. Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit cc64214d05941d09bb9f80051a1949a7223b8ab7 Author: Jason Wang Date: Wed May 2 11:42:06 2012 +0800 macvtap: zerocopy: set SKBTX_DEV_ZEROCOPY only when skb is built successfully commit 01d6657b388438def19c8baaea28e742b6ed32ec upstream. Current the SKBTX_DEV_ZEROCOPY is set unconditionally after zerocopy_sg_from_iovec(), this would lead NULL pointer when macvtap fails to build zerocopy skb because destructor_arg was not initialized. Solve this by set this flag after the skb were built successfully. Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit e9214638ead441c15b48f40c3b459aede77c2662 Author: Jason Wang Date: Wed May 2 11:41:58 2012 +0800 macvtap: zerocopy: put page when fail to get all requested user pages commit 02ce04bb3d28c3333231f43bca677228dbc686fe upstream. When get_user_pages_fast() fails to get all requested pages, we could not use kfree_skb() to free it as it has not been put in the skb fragments. So we need to call put_page() instead. Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit 14180f501e674f1bbe352d37c962ede4d8f679d4 Author: Jason Wang Date: Wed May 2 11:41:44 2012 +0800 macvtap: zerocopy: fix truesize underestimation commit 4ef67ebedffa44ed9939b34708ac2fee06d2f65f upstream. As the skb fragment were pinned/built from user pages, we should account the page instead of length for truesize. Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit c1b5b21b540f22a8e008d30545c044a6c949b47b Author: Jason Wang Date: Wed May 2 11:41:30 2012 +0800 macvtap: zerocopy: fix offset calculation when building skb commit 3afc9621f15701c557e60f61eba9242bac2771dd upstream. This patch fixes the offset calculation when building skb: - offset1 were used as skb data offset not vector offset - reset offset to zero only when we advance to next vector Signed-off-by: Jason Wang Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit 9046e979413d3a05057ae9995cebf93eed24f80c Author: Trond Myklebust Date: Wed Feb 8 13:39:15 2012 -0500 NFSv4: Further reduce the footprint of the idmapper commit 685f50f9188ac1e8244d0340a9d6ea36b6136cec upstream. Don’t allocate the legacy idmapper tables until we actually need them. Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton [bwh: Backported to 3.2: adjust context in nfs_idmap_delete()] Signed-off-by: Ben Hutchings commit 805292ea678d4855d1f12db5f1c1d1444b92d9d7 Author: Trond Myklebust Date: Tue Feb 7 14:59:05 2012 -0500 NFSv4: Reduce the footprint of the idmapper commit d073e9b541e1ac3f52d72c3a153855d9a9ee3278 upstream. Instead of pre-allocating the storage for all the strings, we can significantly reduce the size of that table by doing the allocation when we do the downcall. Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton [bwh: Backported to 3.2: adjust context in nfs_idmap_delete()] Signed-off-by: Ben Hutchings commit 8fc536fc7a502841b732edfc6d5a66f3c138132d Author: David Gibson Date: Wed Mar 21 16:34:12 2012 -0700 hugepages: fix use after free bug in “quota” handling commit 90481622d75715bfcb68501280a917dbfe516029 upstream. hugetlbfs_{get,put}_quota() are badly named. They don’t interact with the general quota handling code, and they don’t much resemble its behaviour. Rather than being about maintaining limits on on-disk block usage by particular users, they are instead about maintaining limits on in-memory page usage (including anonymous MAP_PRIVATE copied-on-write pages) associated with a particular hugetlbfs filesystem instance. Worse, they work by having callbacks to the hugetlbfs filesystem code from the low-level page handling code, in particular from free_huge_page(). This is a layering violation of itself, but more importantly, if the kernel does a get_user_pages() on hugepages (which can happen from KVM amongst others), then the free_huge_page() can be delayed until after the associated inode has already been freed. If an unmount occurs at the wrong time, even the hugetlbfs superblock where the “quota” limits are stored may have been freed. Andrew Barry proposed a patch to fix this by having hugepages, instead of storing a pointer to their address_space and reaching the superblock from there, had the hugepages store pointers directly to the superblock, bumping the reference count as appropriate to avoid it being freed. Andrew Morton rejected that version, however, on the grounds that it made the existing layering violation worse. This is a reworked version of Andrew’s patch, which removes the extra, and some of the existing, layering violation. It works by introducing the concept of a hugepage “subpool” at the lower hugepage mm layer - that is a finite logical pool of hugepages to allocate from. hugetlbfs now creates a subpool for each filesystem instance with a page limit set, and a pointer to the subpool gets added to each allocated hugepage, instead of the address_space pointer used now. The subpool has its own lifetime and is only freed once all pages in it _and_ all other references to it (i.e. superblocks) are gone. subpools are optional - a NULL subpool pointer is taken by the code to mean that no subpool limits are in effect. Previous discussion of this bug found in: "Fix refcounting in hugetlbfs quota handling.". See: https://lkml.org/lkml/2011/8/11/28 or http://marc.info/?l=linux-mm&m=126928970510627&w=1 v2: Fixed a bug spotted by Hillf Danton, and removed the extra parameter to alloc_huge_page() - since it already takes the vma, it is not necessary. Signed-off-by: Andrew Barry Signed-off-by: David Gibson Cc: Hugh Dickins Cc: Mel Gorman Cc: Minchan Kim Cc: Hillf Danton Cc: Paul Mackerras Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [bwh: Backported to 3.2: adjust context to apply after commit c50ac050811d6485616a193eb0f37bfbd191cc89 ‘hugetlb: fix resv_map leak in error path’, backported in 3.2.20] Signed-off-by: Ben Hutchings commit 42bc7c990b1a040fbfc94b8cad7fd8f626040b51 Author: Ben Hutchings Date: Wed Jan 4 21:22:51 2012 -0500 ext4: Report max_batch_time option correctly commit 1d526fc91bea04ee35b7599bf8b82f86c0aaf46c upstream. Currently the value reported for max_batch_time is really the value of min_batch_time. Reported-by: Russell Coker Signed-off-by: Ben Hutchings commit 3116aea632aa83185953d9ca52d216dd524cb845 Author: William Dauchy Date: Wed Mar 14 12:32:04 2012 +0100 NFSv4: Rate limit the state manager for lock reclaim warning messages commit 96dcadc2fdd111dca90d559f189a30c65394451a upstream. Adding rate limit on `Lock reclaim failed` messages since it could fill up system logs Signed-off-by: William Dauchy Signed-off-by: Trond Myklebust [bwh: Backported to 3.2: add the ‘NFS:’ prefix at the same time] Signed-off-by: Ben Hutchings commit 1c095fd3e95b6edf46016ac77ad78a61522ce8ed Author: Eldad Zack Date: Sun Apr 22 00:48:04 2012 +0200 brcmsmac: “INTERMEDIATE but not AMPDU” only when tracing commit 6ead629b27269c553c9092c47cd8f5ab0309ee3b upstream. I keep getting the following messages on the log buffer: [ 2167.097507] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU [ 2281.331305] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU [ 2281.332539] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU [ 2329.876605] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU [ 2329.877354] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU [ 2462.280756] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU [ 2615.651689] ieee80211 phy0: brcms_c_dotxstatus: INTERMEDIATE but not AMPDU From the code comment I understand that this something that can - and does, quite frequently - happen. Signed-off-by: Eldad Zack Acked-by: Franky Lin Signed-off-by: John W. Linville Signed-off-by: Ben Hutchings commit 3838461c8d7479c84129c2744a3fab53e269e340 Author: Lucas De Marchi Date: Tue Jan 17 14:50:51 2012 -0200 kbuild: do not check for ancient modutils tools commit 620c231c7a7f48745094727bb612f6321cfc8844 upstream. scripts/depmod.sh checks for the output of '-V’ expecting that it has module-init-tools in it. It’s a hack to prevent users from using modutils instead of module-init-tools, that only works with 2.4.x kernels. This however prints an annoying warning for kmod tool, that is currently replacing module-init-tools. Rather than putting another check for kmod’s version, just remove it since users of 2.4.x kernel are unlikely to upgrade to 3.x, and if they do, let depmod fail in that case because they should know what they are doing. Signed-off-by: Lucas De Marchi Acked-by: WANG Cong Acked-By: Kay Sievers Signed-off-by: Michal Marek Signed-off-by: Ben Hutchings commit 1e084e62f89d60c15dc4fc282a68309e793daa1b Author: Eugeni Dodonov Date: Thu Feb 23 23:57:06 2012 -0200 drm/i915: fix operator precedence when enabling RC6p commit c0e2ee1bc0cf82eec89e26b7afe7e4db0561b7d9 upstream. As noticed by Torsten Kaiser, the operator precedence can play tricks with us here. CC: Dave Airlie Signed-off-by: Eugeni Dodonov Signed-off-by: Jesse Barnes Signed-off-by: Ben Hutchings commit c90a4d1a113285fedbab465d2fd398f31c294170 Author: Eugeni Dodonov Date: Tue Feb 14 11:44:48 2012 -0200 drm/i915: do not enable RC6p on Sandy Bridge commit 1c8ecf80fdee4e7b23a9e7da7ff9bd59ba2dcf96 upstream. With base on latest findings, RC6p seems to be respondible for RC6-related issues on Sandy Bridge platform. To work-around those issues, the previous solution was to completely disable RC6 on Sandy Bridge for the past few releases, even if plain RC6 was not giving any issues. What this patch does is preventing RC6p from being enabled on Sandy Bridge even if users enable RC6 via a kernel parameter. So it won’t change the defaults in any way, but will ensure that if users do enable RC6 manually it won’t break their machines by enabling this extra state. Proper fix for this (enabling specific RC6 states according to the GPU generation) were proposed for the -next kernel, but we are too late in the release process now to pick such changes. Acked-by: Keith Packard Signed-off-by: Eugeni Dodonov Signed-off-by: Jesse Barnes Signed-off-by: Ben Hutchings commit e31a54c572423d13f14a904b87a41c93a011af78 Author: Stanislav Yakovlev Date: Tue Apr 10 21:44:47 2012 -0400 net/wireless: ipw2x00: add supported cipher suites to wiphy initialization commit a141e6a0097118bb35024485f1faffc0d9042f5c upstream. Driver doesn’t report its supported cipher suites through cfg80211 interface. It still uses wext interface and probably will not work through nl80211, but will at least correctly advertise supported features. Bug was reported by Omar Siam. https://bugzilla.kernel.org/show_bug.cgi?id=43049 Signed-off-by: Stanislav Yakovlev Signed-off-by: John W. Linville Signed-off-by: Ben Hutchings commit e6a46daaf4287c4b6dc00d19868aa8e9f160eb5b Author: Stanislaw Gruszka Date: Wed May 16 11:06:21 2012 +0200 rtl8187: ->brightness_set can not sleep commit 0fde0a8cfd0ede7f310d6a681c8e5a7cb3e32406 upstream. Fix: BUG: sleeping function called from invalid context at kernel/workqueue.c:2547 in_atomic(): 1, irqs_disabled(): 0, pid: 629, name: wpa_supplicant 2 locks held by wpa_supplicant/629: #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x14/0x20 #1: (&trigger->leddev_list_lock){.+.?..}, at: [] led_trigger_event+0x21/0x80 Pid: 629, comm: wpa_supplicant Not tainted 3.3.0-0.rc3.git5.1.fc17.i686 Call Trace: [] __might_sleep+0x126/0x1d0 [] wait_on_work+0x2c/0x1d0 [] __cancel_work_timer+0x6a/0x120 [] cancel_delayed_work_sync+0x10/0x20 [] rtl8187_led_brightness_set+0x82/0xf0 [rtl8187] [] led_trigger_event+0x5c/0x80 [] ieee80211_led_radio+0x1d/0x40 [mac80211] [] ieee80211_stop_device+0x13/0x230 [mac80211] Removing _sync is ok, because if led_on work is currently running it will be finished before led_off work start to perform, since they are always queued on the same mac80211 local->workqueue. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=795176 Signed-off-by: Stanislaw Gruszka Acked-by: Larry Finger Acked-by: Hin-Tak Leung Signed-off-by: John W. Linville Signed-off-by: Ben Hutchings commit 3aa9b60af8f3bc95ad279ebfac202abf97ade173 Author: Matt Carlson Date: Thu Jun 7 12:56:54 2012 +0000 tg3: Apply short DMA frag workaround to 5906 commit b7abee6ef888117f92db370620ebf116a38e3f4d upstream. 5906 devices also need the short DMA fragment workaround. This patch makes the necessary change. Signed-off-by: Matt Carlson Tested-by: Christian Kujau Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit e6364fb003c0bc98c5fcde51aac6fd3b6a1337c3 Author: Eric Dumazet Date: Fri Dec 2 23:41:42 2011 +0000 tcp: drop SYN+FIN messages commit fdf5af0daf8019cec2396cdef8fb042d80fe71fa upstream. Denys Fedoryshchenko reported that SYN+FIN attacks were bringing his linux machines to their limits. Dont call conn_request() if the TCP flags includes SYN flag Reported-by: Denys Fedoryshchenko Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit c0159c780e8d42309d04e83271986274d3880826 Author: Shaohua Li Date: Tue Jul 3 15:57:19 2012 +1000 raid5: delayed stripe fix commit fab363b5ff502d1b39ddcfec04271f5858d9f26e upstream. There isn’t locking setting STRIPE_DELAYED and STRIPE_PREREAD_ACTIVE bits, but the two bits have relationship. A delayed stripe can be moved to hold list only when preread active stripe count is below IO_THRESHOLD. If a stripe has both the bits set, such stripe will be in delayed list and preread count not 0, which will make such stripe never leave delayed list. Signed-off-by: Shaohua Li Signed-off-by: NeilBrown Signed-off-by: Ben Hutchings commit 60b179a2ff59ad09e8217e9dd7266fb23a080ded Author: Corentin Chary Date: Sat Nov 26 11:00:10 2011 +0100 samsung-laptop: make the dmi check less strict commit 3be324a94df0c3f032178d04549dbfbf6cccb09a upstream. This enable the driver for everything that look like a laptop and is from vendor "SAMSUNG ELECTRONICS CO., LTD.". Note that laptop supported by samsung-q10 seem to have a different vendor strict. Also remove every log output until we know that we have a SABI interface (except if the driver is forced to load, or debug is enabled). Keeping a whitelist of laptop with a model granularity is something that can’t work without close vendor cooperation (and we don’t have that). Signed-off-by: Corentin Chary Acked-by: Greg Kroah-Hartman Signed-off-by: Matthew Garrett [bwh: Backported to 3.2: - Adjust context - Drop changes relating to ACPI video] Signed-off-by: Ben Hutchings
Related news
The Bosch Ethernet switch PRA-ES8P2S with software version 1.01.05 runs its web server with root privilege. In combination with CVE-2022-23534 this could give an attacker root access to the switch.