Skip to content

Syncing tree#1

Merged
rabeeh merged 6 commits intorabeeh:linux-linaro-lsk-mx6from
linux4kix:linux-linaro-lsk-mx6
Mar 2, 2014
Merged

Syncing tree#1
rabeeh merged 6 commits intorabeeh:linux-linaro-lsk-mx6from
linux4kix:linux-linaro-lsk-mx6

Conversation

@rabeeh
Copy link
Copy Markdown
Owner

@rabeeh rabeeh commented Mar 2, 2014

No description provided.

linux4kix and others added 6 commits February 26, 2014 13:14
…DMI phy power on""

This reverts commit 6ecb091.

Boundary Devices has the proper fix.
If enabled too early, a flood of interrupts can happen, and as
console_lock is held, you cannot see any messages being printed.

Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
*WARNING HACK*  This is a terrible workaround to hack the
drivername to work with existing XBMC binaries that check
modalias to match a string to allow it to run on IMX hardware.

This needs to be fixed in userspace.
add a phandle to the iram node so sdma can use iram to save
power and reduce latency when playing back audio.
rabeeh added a commit that referenced this pull request Mar 2, 2014
@rabeeh rabeeh merged commit acc9fe7 into rabeeh:linux-linaro-lsk-mx6 Mar 2, 2014
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request May 1, 2014
The order of unregistering codecs and cards seems to be wrong.  This
patch will unregister the card which detaches from the codecs and then
after this unregister the codecs.

Can be triggered by:

echo sound-spdif.26 > /sys/bus/platform/drivers/imx-spdif/unbind

------------[ cut here ]------------
WARNING: at /home/lund/src/linux/fs/sysfs/file.c:498 sysfs_attr_ns+0xa8/0xb0()
sysfs: kobject  without dirent
Modules linked in: brcmfmac brcmutil mxc_v4l2_capture ipu_bg_overlay_sdc ipu_still ipu_prp_enc evbug ipu_csi_enc ipu_fg_overlay_sdc
CPU: 1 PID: 350 Comm: bash Not tainted 3.10.30-02217-g3e038ee linux4kix#9
[<800145d0>] (unwind_backtrace) from [<800115b8>] (show_stack+0x10/0x14)
[<800115b8>] (show_stack) from [<80026824>] (warn_slowpath_common+0x54/0x6c)
[<80026824>] (warn_slowpath_common) from [<8002686c>] (warn_slowpath_fmt+0x30/0x40)
[<8002686c>] (warn_slowpath_fmt) from [<8011bdb0>] (sysfs_attr_ns+0xa8/0xb0)
[<8011bdb0>] (sysfs_attr_ns) from [<8011be64>] (sysfs_remove_file+0x18/0x40)
[<8011be64>] (sysfs_remove_file) from [<804a10a0>] (snd_soc_dapm_free+0x18/0x200)
[<804a10a0>] (snd_soc_dapm_free) from [<8049d014>] (soc_remove_codec+0x2c/0x90)
[<8049d014>] (soc_remove_codec) from [<8049e7f4>] (soc_remove_dai_links+0x320/0x370)
[<8049e7f4>] (soc_remove_dai_links) from [<8049fd94>] (soc_cleanup_card_resources+0x7c/0xb0)
[<8049fd94>] (soc_cleanup_card_resources) from [<8049fddc>] (snd_soc_unregister_card+0x14/0x1c)
[<8049fddc>] (snd_soc_unregister_card) from [<804ae230>] (imx_spdif_audio_remove+0x3c/0x44)
[<804ae230>] (imx_spdif_audio_remove) from [<802d6f64>] (platform_drv_remove+0x18/0x1c)
[<802d6f64>] (platform_drv_remove) from [<802d58ac>] (__device_release_driver+0x70/0xcc)
[<802d58ac>] (__device_release_driver) from [<802d5924>] (device_release_driver+0x1c/0x28)
[<802d5924>] (device_release_driver) from [<802d4930>] (driver_unbind+0x78/0xbc)
[<802d4930>] (driver_unbind) from [<802d4004>] (drv_attr_store+0x20/0x2c)
[<802d4004>] (drv_attr_store) from [<8011c234>] (sysfs_write_file+0x160/0x190)
[<8011c234>] (sysfs_write_file) from [<800c1354>] (vfs_write+0xb4/0x194)
[<800c1354>] (vfs_write) from [<800c18cc>] (SyS_write+0x3c/0x78)
[<800c18cc>] (SyS_write) from [<8000dfc0>] (ret_fast_syscall+0x0/0x30)
---[ end trace aae705555ee5c7cb ]---
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = dc61c000
[00000008] *pgd=6c911831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [rabeeh#1] SMP ARM
Modules linked in: brcmfmac brcmutil mxc_v4l2_capture ipu_bg_overlay_sdc ipu_still ipu_prp_enc evbug ipu_csi_enc ipu_fg_overlay_sdc
CPU: 1 PID: 350 Comm: bash Tainted: G        W    3.10.30-02217-g3e038ee linux4kix#9
task: dc7a8380 ti: dc67a000 task.ti: dc67a000
PC is at soc_remove_codec+0x74/0x90
LR is at soc_remove_dai_links+0x320/0x370
pc : [<8049d05c>]    lr : [<8049e7f4>]    psr: 200f0013
ip : 00000040  fp : dc3326c0
r10: 00000000  r9 : 00000000  r8 : 00100100
r7 : dc4358b0  r6 : dc332300  r5 : dc44710c  r4 : 00000000
r3 : 00000000  r2 : 00100100  r1 : dc45f210  r0 : dc435980
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c53c7d  Table: 6c61c04a  DAC: 00000015
Process bash (pid: 350, stack limit = 0xdc67a238)
Stack: (0xdc67be98 to 0xdc67c000)
be80:                                                       00200200 00200200
bea0: dc45f218 000005b8 dc4358b0 00000001 0000000f 8062b26c dc3ce180 dc448380
bec0: 00000000 8049fd94 dc4358b0 dc435810 80891dc4 8049fddc 00000001 804ae230
bee0: 804ae1f4 dc15f210 80891dc4 802d6f64 802d6f4c 802d58ac dc7a8380 dc15f244
bf00: dc15f210 802d5924 dc7a8380 8086f4e0 dc15f210 802d4930 0000000f 0000000f
bf20: dcb5d280 dcb5d298 dc67bf80 802d400 0000000f 8011c234 dc8706c0 76f03000
bf40: dc67bf80 0000000f 00000000 0000000f 00000000 800c1354 dc6f1600 00000000
bf60: 00000001 00000000 00000000 dc8706c0 76f03000 00000000 0000000f 800c18cc
bf80: 00000000 00000000 00000000 0000000f 76f03000 76df2b38 00000004 8000e144
bfa0: dc67a000 8000dfc0 0000000f 76f03000 00000001 76f03000 0000000f 00000000
bfc0: 0000000f 76f03000 76df2b38 00000004 0000000f 76f03000 0000000f 00000000
bfe0: 00000000 7eb1492c 76d2c234 76d8041c 600f0010 00000001 00000000 00000000
[<8049d05c>] (soc_remove_codec) from [<00200200>] (0x200200)
Code: e5842038 e584303c e5913050 e8bd4010 (e5930008)
---[ end trace aae705555ee5c7cc ]---
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
A long time ago in a galaxy far away....

.. the was a commit made to fix some ilinux specific "fragmented
buffer" log recovery problem:

http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=b29c0bece51da72fb3ff3b61391a391ea54e1603

That problem occurred when a contiguous dirty region of a buffer was
split across across two pages of an unmapped buffer. It's been a
long time since that has been done in XFS, and the changes to log
the entire inode buffers for CRC enabled filesystems has
re-introduced that corner case.

And, of course, it turns out that the above commit didn't actually
fix anything - it just ensured that log recovery is guaranteed to
fail when this situation occurs. And now for the gory details.

xfstest xfs/085 is failing with this assert:

XFS (vdb): bad number of regions (0) in inode log format
XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 1583

Largely undocumented factoid rabeeh#1: Log recovery depends on all log
buffer format items starting with this format:

struct foo_log_format {
	__uint16_t	type;
	__uint16_t	size;
	....

As recoery uses the size field and assumptions about 32 bit
alignment in decoding format items.  So don't pay much attention to
the fact log recovery thinks that it decoding an inode log format
item - it just uses them to determine what the size of the item is.

But why would it see a log format item with a zero size? Well,
luckily enough xfs_logprint uses the same code and gives the same
error, so with a bit of gdb magic, it turns out that it isn't a log
format that is being decoded. What logprint tells us is this:

Oper (130): tid: a0375e1a  len: 28  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 144 (0x90)  len: 16  bmap size: 2  flags: 0x4000
Oper (131): tid: a0375e1a  len: 4096  clientid: TRANS  flags: none
BUF DATA
----------------------------------------------------------------------------
Oper (132): tid: a0375e1a  len: 4096  clientid: TRANS  flags: none
xfs_logprint: unknown log operation type (4e49)
**********************************************************************
* ERROR: data block=2                                                 *
**********************************************************************

That we've got a buffer format item (oper 130) that has two regions;
the format item itself and one dirty region. The subsequent region
after the buffer format item and it's data is them what we are
tripping over, and the first bytes of it at an inode magic number.
Not a log opheader like there is supposed to be.

That means there's a problem with the buffer format item. It's dirty
data region is 4096 bytes, and it contains - you guessed it -
initialised inodes. But inode buffers are 8k, not 4k, and we log
them in their entirety. So something is wrong here. The buffer
format item contains:

(gdb) p /x *(struct xfs_buf_log_format *)in_f
$22 = {blf_type = 0x123c, blf_size = 0x2, blf_flags = 0x4000,
       blf_len = 0x10, blf_blkno = 0x90, blf_map_size = 0x2,
       blf_data_map = {0xffffffff, 0xffffffff, .... }}

Two regions, and a signle dirty contiguous region of 64 bits.  64 *
128 = 8k, so this should be followed by a single 8k region of data.
And the blf_flags tell us that the type of buffer is a
XFS_BLFT_DINO_BUF. It contains inodes. And because it doesn't have
the XFS_BLF_INODE_BUF flag set, that means it's an inode allocation
buffer. So, it should be followed by 8k of inode data.

But we know that the next region has a header of:

(gdb) p /x *ohead
$25 = {oh_tid = 0x1a5e37a0, oh_len = 0x100000, oh_clientid = 0x69,
       oh_flags = 0x0, oh_res2 = 0x0}

and so be32_to_cpu(oh_len) = 0x1000 = 4096 bytes. It's simply not
long enough to hold all the logged data. There must be another
region. There is - there's a following opheader for another 4k of
data that contains the other half of the inode cluster data - the
one we assert fail on because it's not a log format header.

So why is the second part of the data not being accounted to the
correct buffer log format structure? It took a little more work with
gdb to work out that the buffer log format structure was both
expecting it to be there but hadn't accounted for it. It was at that
point I went to the kernel code, as clearly this wasn't a bug in
xfs_logprint and the kernel was writing bad stuff to the log.

First port of call was the buffer item formatting code, and the
discontiguous memory/contiguous dirty region handling code
immediately stood out. I've wondered for a long time why the code
had this comment in it:

                        vecp->i_addr = xfs_buf_offset(bp, buffer_offset);
                        vecp->i_len = nbits * XFS_BLF_CHUNK;
                        vecp->i_type = XLOG_REG_TYPE_BCHUNK;
/*
 * You would think we need to bump the nvecs here too, but we do not
 * this number is used by recovery, and it gets confused by the boundary
 * split here
 *                      nvecs++;
 */
                        vecp++;

And it didn't account for the extra vector pointer. The case being
handled here is that a contiguous dirty region lies across a
boundary that cannot be memcpy()d across, and so has to be split
into two separate operations for xlog_write() to perform.

What this code assumes is that what is written to the log is two
consecutive blocks of data that are accounted in the buf log format
item as the same contiguous dirty region and so will get decoded as
such by the log recovery code.

The thing is, xlog_write() knows nothing about this, and so just
does it's normal thing of adding an opheader for each vector. That
means the 8k region gets written to the log as two separate regions
of 4k each, but because nvecs has not been incremented, the buf log
format item accounts for only one of them.

Hence when we come to log recovery, we process the first 4k region
and then expect to come across a new item that starts with a log
format structure of some kind that tells us whenteh next data is
going to be. Instead, we hit raw buffer data and things go bad real
quick.

So, the commit from 2002 that commented out nvecs++ is just plain
wrong. It breaks log recovery completely, and it would seem the only
reason this hasn't been since then is that we don't log large
contigous regions of multi-page unmapped buffers very often. Never
would be a closer estimate, at least until the CRC code came along....

So, lets fix that by restoring the nvecs accounting for the extra
region when we hit this case.....

.... and there's the problemin log recovery it is apparently working
around:

XFS: Assertion failed: i == item->ri_total, file: fs/xfs/xfs_log_recover.c, line: 2135

Yup, xlog_recover_do_reg_buffer() doesn't handle contigous dirty
regions being broken up into multiple regions by the log formatting
code. That's an easy fix, though - if the number of contiguous dirty
bits exceeds the length of the region being copied out of the log,
only account for the number of dirty bits that region covers, and
then loop again and copy more from the next region. It's a 2 line
fix.

Now xfstests xfs/085 passes, we have one less piece of mystery
code, and one more important piece of knowledge about how to
structure new log format items..

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Got bellow lockdep warning during tests. It is false alarm though.

[ 1184.479097] =============================================
[ 1184.479187] [ INFO: possible recursive locking detected ]
[ 1184.479277] 3.10.0-rc3+ linux4kix#13 Tainted: G         C
[ 1184.479355] ---------------------------------------------
[ 1184.479444] mkdir/2215 is trying to acquire lock:
[ 1184.479521]  (&(&dentry->d_lock)->rlock){+.+...}, at: [<ffffffffa06cc27c>] ll_md_blocking_ast+0x55c/0x655 [lustre]
[ 1184.479801]
but task is already holding lock:
[ 1184.479895]  (&(&dentry->d_lock)->rlock){+.+...}, at: [<ffffffffa06cc1b1>] ll_md_blocking_ast+0x491/0x655 [lustre]
[ 1184.480101]
other info that might help us debug this:
[ 1184.480206]  Possible unsafe locking scenario:

[ 1184.480300]        CPU0
[ 1184.480340]        ----
[ 1184.480380]   lock(&(&dentry->d_lock)->rlock);
[ 1184.480458]   lock(&(&dentry->d_lock)->rlock);
[ 1184.480536]
 *** DEADLOCK ***

[ 1184.480761]  May be due to missing lock nesting notation

[ 1184.480936] 4 locks held by mkdir/2215:
[ 1184.481037]  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff811531a9>] mnt_want_write+0x24/0x4b
[ 1184.481273]  rabeeh#1:  (&type->i_mutex_dir_key#3/1){+.+.+.}, at: [<ffffffff81144fce>] kern_path_create+0x8c/0x144
[ 1184.481513]  rabeeh#2:  (&sb->s_type->i_lock_key#19){+.+...}, at: [<ffffffffa06cc180>] ll_md_blocking_ast+0x460/0x655 [lustre]
[ 1184.481778]  rabeeh#3:  (&(&dentry->d_lock)->rlock){+.+...}, at: [<ffffffffa06cc1b1>] ll_md_blocking_ast+0x491/0x655 [lustre]
[ 1184.482050]

Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Michael L. Semon has been testing CRC patches on a 32 bit system and
been seeing assert failures in the directory code from xfs/080.
Thanks to Michael's heroic efforts with printk debugging, we found
that the problem was that the last free space being left in the
directory structure was too small to fit a unused tag structure and
it was being corrupted and attempting to log a region out of bounds.
Hence the assert failure looked something like:

.....
rabeeh#5 calling xfs_dir2_data_log_unused() 36 32
rabeeh#1 4092 4095 4096
rabeeh#2 8182 8183 4096
XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568

Where rabeeh#1 showed the first region of the dup being logged (i.e. the
last 4 bytes of a directory buffer) and rabeeh#2 shows the corrupt values
being calculated from the length of the dup entry which overflowed
the size of the buffer.

It turns out that the problem was not in the logging code, nor in
the freespace handling code. It is an initial condition bug that
only shows up on 32 bit systems. When a new buffer is initialised,
where's the freespace that is set up:

[  172.316249] calling xfs_dir2_leaf_addname() from xfs_dir_createname()
[  172.316346] linux4kix#9 calling xfs_dir2_data_log_unused()
[  172.316351] rabeeh#1 calling xfs_trans_log_buf() 60 63 4096
[  172.316353] rabeeh#2 calling xfs_trans_log_buf() 4094 4095 4096

Note the offset of the first region being logged? It's 60 bytes into
the buffer. Once I saw that, I pretty much knew that the bug was
going to be caused by this.

Essentially, all direct entries are rounded to 8 bytes in length,
and all entries start with an 8 byte alignment. This means that we
can decode inplace as variables are naturally aligned. With the
directory data supposedly starting on a 8 byte boundary, and all
entries padded to 8 bytes, the minimum freespace in a directory
block is supposed to be 8 bytes, which is large enough to fit a
unused data entry structure (6 bytes in size). The fact we only have
4 bytes of free space indicates a directory data block alignment
problem.

And what do you know - there's an implicit hole in the directory
data block header for the CRC format, which means the header is 60
byte on 32 bit intel systems and 64 bytes on 64 bit systems. Needs
padding. And while looking at the structures, I found the same
problem in the attr leaf header. Fix them both.

Note that this only affects 32 bit systems with CRCs enabled.
Everything else is just fine. Note that CRC enabled filesystems created
before this fix on such systems will not be readable with this fix
applied.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Debugged-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Under ARM64, PTEs can be broadly categorised as follows:
   - Present and valid: Bit #0 is set. The PTE is valid and memory
     access to the region may fault.

   - Present and invalid: Bit #0 is clear and bit rabeeh#1 is set.
     Represents present memory with PROT_NONE protection. The PTE
     is an invalid entry, and the user fault handler will raise a
     SIGSEGV.

   - Not present (file or swap): Bits #0 and rabeeh#1 are clear.
     Memory represented has been paged out. The PTE is an invalid
     entry, and the fault handler will try and re-populate the
     memory where necessary.

Huge PTEs are block descriptors that have bit rabeeh#1 clear. If we wish
to represent PROT_NONE huge PTEs we then run into a problem as
there is no way to distinguish between regular and huge PTEs if we
set bit rabeeh#1.

To resolve this ambiguity this patch moves PTE_PROT_NONE from
bit rabeeh#1 to bit rabeeh#2 and moves PTE_FILE from bit rabeeh#2 to bit rabeeh#3. The
number of swap/file bits is reduced by 1 as a consequence, leaving
60 bits for file and swap entries.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…erspace

Running an OABI_COMPAT kernel on an SMP platform can lead to fun and
games with page aging.

If one CPU issues a swi instruction immediately before another CPU
decides to mkold the page containing the swi instruction, then we will
fault attempting to load the instruction during the vector_swi handler
in order to retrieve its immediate field. Since this fault is not
currently dealt with by our exception tables, this results in a panic:

  Unable to handle kernel paging request at virtual address 4020841c
  pgd = c490c000
  [4020841c] *pgd=84451831, *pte=bf05859d, *ppte=00000000
  Internal error: Oops: 17 [rabeeh#1] PREEMPT SMP ARM
  Modules linked in: hid_sony(O)
  CPU: 1    Tainted: G        W  O  (3.4.0-perf-gf496dca-01162-gcbcc62b rabeeh#1)
  PC is at vector_swi+0x28/0x88
  LR is at 0x40208420

This patch wraps all of the swi instruction loads with the USER macro
and provides a shared exception table entry which simply rewinds the
saved user PC and returns from the system call (without setting tbl, so
there's no worries with tracing or syscall restarting). Returning to
userspace will re-enter the page fault handler, from where we will
probably send SIGSEGV to the current task.

Reported-by: Wang, Yalin <yalin.wang@sonymobile.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
We had a report of a reproducible WARNING:

[ 1360.039358] ------------[ cut here ]------------
[ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0()
[ 1360.049880] Hardware name: HP Z200 Workstation
[ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd
auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm
snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel
snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr
iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod
sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci
drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
auth_rpcgss]
[ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G         I --------------   3.9.0-0.55.el7.x86_64 rabeeh#1
[ 1360.116771] Call Trace:
[ 1360.119219]  [<ffffffff810610c0>] warn_slowpath_common+0x70/0xa0
[ 1360.125208]  [<ffffffff810611aa>] warn_slowpath_null+0x1a/0x20
[ 1360.131025]  [<ffffffff811af46d>] d_set_d_op+0x8d/0xc0
[ 1360.136159]  [<ffffffffa05a7d6f>] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc]
[ 1360.143710]  [<ffffffffa05a8cc6>] rpc_mkpipe_dentry+0x86/0x170 [sunrpc]
[ 1360.150311]  [<ffffffffa062a7b6>] nfs_idmap_new+0x96/0x130 [nfsv4]
[ 1360.156475]  [<ffffffffa062e7cd>] nfs4_init_client+0xad/0x2d0 [nfsv4]
[ 1360.162902]  [<ffffffff812f02df>] ? idr_get_empty_slot+0x16f/0x3c0
[ 1360.169062]  [<ffffffff812f0582>] ? idr_mark_full+0x52/0x60
[ 1360.174615]  [<ffffffff812f0699>] ? idr_alloc+0x79/0xe0
[ 1360.179826]  [<ffffffffa0598081>] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
[ 1360.187635]  [<ffffffffa05980f3>] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
[ 1360.194493]  [<ffffffffa05d05da>] nfs_get_client+0x27a/0x350 [nfs]
[ 1360.200666]  [<ffffffffa062e438>] nfs4_set_client.isra.8+0x78/0x100 [nfsv4]
[ 1360.207624]  [<ffffffffa062f2f3>] nfs4_create_server+0xf3/0x3a0 [nfsv4]
[ 1360.214222]  [<ffffffffa06284be>] nfs4_remote_mount+0x2e/0x60 [nfsv4]
[ 1360.220644]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
[ 1360.225691]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
[ 1360.231348]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
[ 1360.236822]  [<ffffffffa0628396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[ 1360.243246]  [<ffffffffa06287b4>] nfs4_try_mount+0x44/0xc0 [nfsv4]
[ 1360.249410]  [<ffffffffa05d1457>] ? get_nfs_version+0x27/0x80 [nfs]
[ 1360.255659]  [<ffffffffa05db985>] nfs_fs_mount+0x5c5/0xd10 [nfs]
[ 1360.261650]  [<ffffffffa05dc550>] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1360.268074]  [<ffffffffa05da8e0>] ? param_set_portnr+0x60/0x60 [nfs]
[ 1360.274406]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
[ 1360.279443]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
[ 1360.285088]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
[ 1360.290556]  [<ffffffff811b9f5d>] do_mount+0x1fd/0xa00
[ 1360.295677]  [<ffffffff81137dee>] ? __get_free_pages+0xe/0x50
[ 1360.301405]  [<ffffffff811b9be6>] ? copy_mount_options+0x36/0x170
[ 1360.307479]  [<ffffffff811ba7e3>] sys_mount+0x83/0xc0
[ 1360.312515]  [<ffffffff8160ad59>] system_call_fastpath+0x16/0x1b
[ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]---

The problem is that we're ending up in __rpc_lookup_create_exclusive
with a negative dentry that already has d_op set. A little debugging
has shown that when we hit this, the d_ops are already set to
simple_dentry_operations.

I believe that what's happening is that during a mount, idmapd is racing
in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap.
Before that dentry reference is released, the kernel races in to create
that file and finds the new negative dentry, which already has the
d_op set.

This patch just avoids setting the d_op if it's already set.
simple_dentry_operations and rpc_dentry_operations are functionally
equivalent so it shouldn't matter which one it's set to.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Don't sleep in __fscache_maybe_release_page() if __GFP_FS is not set.  This
goes some way towards mitigating fscache deadlocking against ext4 by way of
the allocator, eg:

INFO: task flush-8:0:24427 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0       D ffff88003e2b9fd8     0 24427      2 0x00000000
 ffff88003e2b9138 0000000000000046 ffff880012e3a040 ffff88003e2b9fd8
 0000000000011c80 ffff88003e2b9fd8 ffffffff81a10400 ffff880012e3a040
 0000000000000002 ffff880012e3a040 ffff88003e2b9098 ffffffff8106dcf5
Call Trace:
 [<ffffffff8106dcf5>] ? __lock_is_held+0x31/0x53
 [<ffffffff81219b61>] ? radix_tree_lookup_element+0xf4/0x12a
 [<ffffffff81454bed>] schedule+0x60/0x62
 [<ffffffffa01d349c>] __fscache_wait_on_page_write+0x8b/0xa5 [fscache]
 [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d
 [<ffffffffa01d393a>] __fscache_maybe_release_page+0x30c/0x324 [fscache]
 [<ffffffffa01d369a>] ? __fscache_maybe_release_page+0x6c/0x324 [fscache]
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffffa01fd7b2>] nfs_fscache_release_page+0x68/0x94 [nfs]
 [<ffffffffa01ef73e>] nfs_release_page+0x7e/0x86 [nfs]
 [<ffffffff810aa553>] try_to_release_page+0x32/0x3b
 [<ffffffff810b6c70>] shrink_page_list+0x535/0x71a
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffff810b7352>] shrink_inactive_list+0x20a/0x2dd
 [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea
 [<ffffffff810b7a65>] shrink_lruvec+0x34c/0x3eb
 [<ffffffff810b7bd3>] do_try_to_free_pages+0xcf/0x355
 [<ffffffff810b7fc8>] try_to_free_pages+0x9a/0xa1
 [<ffffffff810b08d2>] __alloc_pages_nodemask+0x494/0x6f7
 [<ffffffff810d9a07>] kmem_getpages+0x58/0x155
 [<ffffffff810dc002>] fallback_alloc+0x120/0x1f3
 [<ffffffff8106db23>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff810dbed3>] ____cache_alloc_node+0x177/0x186
 [<ffffffff81162a6c>] ? ext4_init_io_end+0x1c/0x37
 [<ffffffff810dc403>] kmem_cache_alloc+0xf1/0x176
 [<ffffffff810b17ac>] ? test_set_page_writeback+0x101/0x113
 [<ffffffff81162a6c>] ext4_init_io_end+0x1c/0x37
 [<ffffffff81162ce4>] ext4_bio_write_page+0x20f/0x3af
 [<ffffffff8115cc02>] mpage_da_submit_io+0x26e/0x2f6
 [<ffffffff811088e5>] ? __find_get_block_slow+0x38/0x133
 [<ffffffff81161348>] mpage_da_map_and_submit+0x3a7/0x3bd
 [<ffffffff81161a60>] ext4_da_writepages+0x30d/0x426
 [<ffffffff810b3359>] do_writepages+0x1c/0x2a
 [<ffffffff81102f4d>] __writeback_single_inode+0x3e/0xe5
 [<ffffffff81103995>] writeback_sb_inodes+0x1bd/0x2f4
 [<ffffffff81103b3b>] __writeback_inodes_wb+0x6f/0xb4
 [<ffffffff81103c81>] wb_writeback+0x101/0x195
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffff811043aa>] ? wb_do_writeback+0xaa/0x173
 [<ffffffff8110434a>] wb_do_writeback+0x4a/0x173
 [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81038554>] ? del_timer+0x4b/0x5b
 [<ffffffff811044e0>] bdi_writeback_thread+0x6d/0x147
 [<ffffffff81104473>] ? wb_do_writeback+0x173/0x173
 [<ffffffff81048fbc>] kthread+0xd0/0xd8
 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
2 locks held by flush-8:0/24427:
 #0:  (&type->s_umount_key#41){.+.+..}, at: [<ffffffff810e3b73>] grab_super_passive+0x4c/0x76
 rabeeh#1:  (jbd2_handle){+.+...}, at: [<ffffffff81190d81>] start_this_handle+0x475/0x4ea


The problem here is that another thread, which is attempting to write the
to-be-stored NFS page to the on-ext4 cache file is waiting for the journal
lock, eg:

INFO: task kworker/u:2:24437 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u:2     D ffff880039589768     0 24437      2 0x00000000
 ffff8800395896d8 0000000000000046 ffff8800283bf040 ffff880039589fd8
 0000000000011c80 ffff880039589fd8 ffff880039f0b040 ffff8800283bf040
 0000000000000006 ffff8800283bf6b8 ffff880039589658 ffffffff81071a13
Call Trace:
 [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea
 [<ffffffff81455e73>] ? _raw_spin_unlock_irqrestore+0x3a/0x50
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81454bed>] schedule+0x60/0x62
 [<ffffffff81190c23>] start_this_handle+0x317/0x4ea
 [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d
 [<ffffffff81190fcc>] jbd2__journal_start+0xb3/0x12e
 [<ffffffff81176606>] __ext4_journal_start_sb+0xb2/0xc6
 [<ffffffff8115f137>] ext4_da_write_begin+0x109/0x233
 [<ffffffff810a964d>] generic_file_buffered_write+0x11a/0x264
 [<ffffffff811032cf>] ? __mark_inode_dirty+0x2d/0x1ee
 [<ffffffff810ab1ab>] __generic_file_aio_write+0x2a5/0x2d5
 [<ffffffff810ab24a>] generic_file_aio_write+0x6f/0xd0
 [<ffffffff81159a2c>] ext4_file_write+0x38c/0x3c4
 [<ffffffff810e0915>] do_sync_write+0x91/0xd1
 [<ffffffffa00a17f0>] cachefiles_write_page+0x26f/0x310 [cachefiles]
 [<ffffffffa01d470b>] fscache_write_op+0x21e/0x37a [fscache]
 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffffa01d2479>] fscache_op_work_func+0x78/0xd7 [fscache]
 [<ffffffff8104455a>] process_one_work+0x232/0x3a8
 [<ffffffff810444ff>] ? process_one_work+0x1d7/0x3a8
 [<ffffffff81044ee0>] worker_thread+0x214/0x303
 [<ffffffff81044ccc>] ? manage_workers+0x245/0x245
 [<ffffffff81048fbc>] kthread+0xd0/0xd8
 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
4 locks held by kworker/u:2/24437:
 #0:  (fscache_operation){.+.+.+}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8
 rabeeh#1:  ((&op->work)){+.+.+.}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8
 rabeeh#2:  (sb_writers#14){.+.+.+}, at: [<ffffffff810ab22c>] generic_file_aio_write+0x51/0xd0
 rabeeh#3:  (&sb->s_type->i_mutex_key#19){+.+.+.}, at: [<ffffffff810ab236>] generic_file_aio_write+0x5b/0x

fscache already tries to cancel pending stores, but it can't cancel a write
for which I/O is already in progress.

An alternative would be to accept writing garbage to the cache under extreme
circumstances and to kill the afflicted cache object if we have to do this.
However, we really need to know how strapped the allocator is before deciding
to do that.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-By: Milosz Tanski <milosz@adfin.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
The call stack below shows how this happens: basically eager_fpu_init()
calls __thread_fpu_begin(current) which then does if (!use_eager_fpu()),
which, in turn, uses static_cpu_has.

And we're executing before alternatives so static_cpu_has doesn't work
there yet.

Use the safe variant in this path which becomes optimal after
alternatives have run.

WARNING: at arch/x86/kernel/cpu/common.c:1368 warn_pre_alternatives+0x1e/0x20()
You're using static_cpu_has before alternatives have run!
Modules linked in:
Pid: 0, comm: swapper Not tainted 3.9.0-rc8+ rabeeh#1
Call Trace:
 warn_slowpath_common
 warn_slowpath_fmt
 ? fpu_finit
 warn_pre_alternatives
 eager_fpu_init
 fpu_init
 cpu_init
 trap_init
 start_kernel
 ? repair_env_string
 x86_64_start_reservations
 x86_64_start_kernel

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1370772454-6106-6-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ack page

Historically, pt_regs would end at offset of 1 word from end of stack
page.

        -----------------  -> START of page (task->stack)
        |               |
        | thread_info   |
        -----------------
        |               |
   ^    ~               ~
   |    ~               ~
   |    |               |
   |    |               | <---- pt_regs used to END here
        -----------------
        | 1 word GUTTER |
        ----------------- -> End of page (START of kernel stack)

This required special "one-off" considerations in low level code.

The root cause is very likely assumption of "empty" SP by the original
ARC kernel hackers, despite ARC700 always been "full" SP.

So finally RIP one word gutter !

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
The length of the registers area for the Marvell 370/XP Ethernet controller
was incorrect in the .dtsi: 0x2500, while it should have been 0x4000.
This problem wasn't noticed because there used to be a static mapping for
all the MMIO register region set up by ->map_io().

The register length was fixed in all the other device tree files,
except from the armada-xp-mv78260.dtsi, in the following commit:

  commit cf8088c
  Author: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
  Date:   Tue May 21 12:33:27 2013 +0200

    arm: mvebu: fix length of Ethernet registers area in .dtsi

This commit fixes a kernel panic in mvneta_probe(), when the kernel
tries to access the unmapped registers:

[  163.639092] mvneta d0070000.ethernet eth0: mac: 6e:3c:4f:87:17:2e
[  163.646962] mvneta d0074000.ethernet eth1: mac: 6a:04:4e:6f:f5:ef
[  163.654853] mvneta d0030000.ethernet eth2: mac: 2a:99:19:19:fc:4c
[  163.661258] Unable to handle kernel paging request at virtual address f011bcf0
[  163.668523] pgd = c0004000
[  163.671237] [f011bcf0] *pgd=2f006811, *pte=00000000, *ppte=00000000
[  163.677565] Internal error: Oops: 807 [rabeeh#1] SMP ARM
[  163.682370] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc6-01850-gba0682e linux4kix#11
[  163.690046] task: ef04c000 ti: ef03e000 task.ti: ef03e000
[  163.695467] PC is at mvneta_probe+0x34c/0xabc
[...]

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
On a CPU that never ran anything, both the active and reserved ASID
fields are set to zero. In this case the ASID_TO_IDX() macro will
return -1, which is not a very useful value to index a bitmap.

Instead of trying to offset the ASID so that ASID rabeeh#1 is actually
bit 0 in the asid_map bitmap, just always ignore bit 0 and start
the search from bit 1. This makes the code a bit more readable,
and without risk of OoB access.

Cc: <stable@vger.kernel.org> # 3.9
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
cgroupfs_root used to have ->actual_subsys_mask in addition to
->subsys_mask.  a8a648c ("cgroup: remove
cgroup->actual_subsys_mask") removed it noting that the subsys_mask is
essentially temporary and doesn't belong in cgroupfs_root; however,
the patch made it impossible to tell whether a cgroupfs_root actually
has the subsystems bound or just have the bits set leading to the
following BUG when trying to mount with subsystems which are already
mounted elsewhere.

 kernel BUG at kernel/cgroup.c:1038!
 invalid opcode: 0000 [rabeeh#1] PREEMPT SMP DEBUG_PAGEALLOC
 ...
 CPU: 1 PID: 7973 Comm: mount Tainted: G        W    3.10.0-rc7-next-20130625-sasha-00011-g1c1dc0e #1105
 task: ffff880fc0ae8000 ti: ffff880fc0b9a000 task.ti: ffff880fc0b9a000
 RIP: 0010:[<ffffffff81249b29>]  [<ffffffff81249b29>] rebind_subsystems+0x409/0x5f0
 ...
 Call Trace:
  [<ffffffff8124bd4f>] cgroup_kill_sb+0xff/0x210
  [<ffffffff813d21af>] deactivate_locked_super+0x4f/0x90
  [<ffffffff8124f3b3>] cgroup_mount+0x673/0x6e0
  [<ffffffff81257169>] cpuset_mount+0xd9/0x110
  [<ffffffff813d2580>] mount_fs+0xb0/0x2d0
  [<ffffffff81404afd>] vfs_kern_mount+0xbd/0x180
  [<ffffffff814070b5>] do_new_mount+0x145/0x2c0
  [<ffffffff814085d6>] do_mount+0x356/0x3c0
  [<ffffffff8140873d>] SyS_mount+0xfd/0x140
  [<ffffffff854eb600>] tracesys+0xdd/0xe2

We still want rebind_subsystems() to take added/removed masks, so
let's fix it by marking whether a cgroupfs_root has finished binding
or not.  Also, document what's going on around ->subsys_mask
initialization so that similar mistakes aren't repeated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Li Zefan <lizefan@huawei.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
eb178d0 ("cgroup: grab cgroup_mutex in
drop_parsed_module_refcounts()") made drop_parsed_module_refcounts()
grab cgroup_mutex to make lockdep assertion in for_each_subsys()
happy.  Unfortunately, cgroup_remount() calls the function while
holding cgroup_mutex in its failure path leading to the following
deadlock.

# mount -t cgroup -o remount,memory,blkio cgroup blkio

 cgroup: option changes via remount are deprecated (pid=525 comm=mount)

 =============================================
 [ INFO: possible recursive locking detected ]
 3.10.0-rc4-work+ rabeeh#1 Not tainted
 ---------------------------------------------
 mount/525 is trying to acquire lock:
  (cgroup_mutex){+.+.+.}, at: [<ffffffff8110a3e1>] drop_parsed_module_refcounts+0x21/0xb0

 but task is already holding lock:
  (cgroup_mutex){+.+.+.}, at: [<ffffffff8110e4e1>] cgroup_remount+0x51/0x200

 other info that might help us debug this:
  Possible unsafe locking scenario:

	CPU0
	----
   lock(cgroup_mutex);
   lock(cgroup_mutex);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 4 locks held by mount/525:
  #0:  (&type->s_umount_key#30){+.+...}, at: [<ffffffff811e9a0d>] do_mount+0x2bd/0xa30
  rabeeh#1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff8110e4d3>] cgroup_remount+0x43/0x200
  rabeeh#2:  (cgroup_mutex){+.+.+.}, at: [<ffffffff8110e4e1>] cgroup_remount+0x51/0x200
  rabeeh#3:  (cgroup_root_mutex){+.+.+.}, at: [<ffffffff8110e4ef>] cgroup_remount+0x5f/0x200

 stack backtrace:
 CPU: 2 PID: 525 Comm: mount Not tainted 3.10.0-rc4-work+ rabeeh#1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  ffffffff829651f0 ffff88000ec2fc28 ffffffff81c24bb1 ffff88000ec2fce8
  ffffffff810f420d 0000000000000006 0000000000000001 0000000000000056
  ffff8800153b4640 ffff880000000000 ffffffff81c2e468 ffff8800153b4640
 Call Trace:
  [<ffffffff81c24bb1>] dump_stack+0x19/0x1b
  [<ffffffff810f420d>] __lock_acquire+0x15dd/0x1e60
  [<ffffffff810f531c>] lock_acquire+0x9c/0x1f0
  [<ffffffff81c2a805>] mutex_lock_nested+0x65/0x410
  [<ffffffff8110a3e1>] drop_parsed_module_refcounts+0x21/0xb0
  [<ffffffff8110e63e>] cgroup_remount+0x1ae/0x200
  [<ffffffff811c9bb2>] do_remount_sb+0x82/0x190
  [<ffffffff811e9d41>] do_mount+0x5f1/0xa30
  [<ffffffff811ea203>] SyS_mount+0x83/0xc0
  [<ffffffff81c2fb82>] system_call_fastpath+0x16/0x1b

Fix it by moving the drop_parsed_module_refcounts() invocation outside
cgroup_mutex.

Signed-off-by: Tejun Heo <tj@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
It seems following race is possible:

	cpu0					cpux
smp_init->cpu_up->_cpu_up
	__cpu_up
		kick_cpu(1)
-------------------------------------------------------------------------
		waiting online			...
		...				notify CPU_STARTING
							set cpux active
						set cpux online
-------------------------------------------------------------------------
		finish waiting online
		...
sched_init_smp
	init_sched_domains(cpu_active_mask)
		build_sched_domains
						set cpux sibling info
-------------------------------------------------------------------------

Execution of cpu0 and cpux could be concurrent between two separator
lines.

So if the cpux sibling information was set too late (normally
impossible, but could be triggered by adding some delay in
start_secondary, after setting cpu online), build_sched_domains()
running on cpu0 might see cpux active, with an empty sibling mask, then
cause some bad address accessing like following:

[    0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f
[    0.099868] Faulting instruction address: 0xc0000000000b7a64
[    0.099883] Oops: Kernel access of bad area, sig: 11 [rabeeh#1]
[    0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
[    0.099922] Modules linked in:
[    0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425-dirty linux4kix#16
[    0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000
[    0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934
[    0.099985] REGS: c0000001fed7f760 TRAP: 0300   Not tainted  (3.10.0-rc1-00120-gb973425-dirty)
[    0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 24272828  XER: 20000003
[    0.100045] SOFTE: 1
[    0.100053] CFAR: c000000000445ee8
[    0.100064] DAR: c00000038518078f, DSISR: 40000000
[    0.100073]
GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010
GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000
GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f
GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000
GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000
GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000
GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0
[    0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4
[    0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4
[    0.100318] Call Trace:
[    0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable)
[    0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14
[    0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654
[    0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c
[    0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c
[    0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68
[    0.100445] Instruction dump:
[    0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001
[    0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010
[    0.100559] ---[ end trace 31fd0ba7d8756001 ]---

This patch tries to move the sibling maps updating before
notify_cpu_starting() and cpu online, and a write barrier there to make
sure sibling maps are updated before active and online mask.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
The MC8305 module got an additional entry added based solely on
information from a Windows driver *.inf file. We now have the
actual descriptor layout from one of these modules, and it
consists of two alternate configurations where cfg rabeeh#1 is a
normal Gobi 2k layout and cfg rabeeh#2 is MBIM only, using interface
numbers 5 and 6 for MBIM control and data. The extra Windows
driver entry for interface number 5 was most likely a bug.

Deleting the bogus entry to avoid unnecessary qmi_wwan probe
failures when using the MBIM configuration.

Reported-by: Lana Black <sickmind@lavabit.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
dingtianhong reported the following deadlock detected by lockdep:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.4.24.05-0.1-default rabeeh#1 Not tainted
 -------------------------------------------------------
 ksoftirqd/0/3 is trying to acquire lock:
  (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120

 but task is already holding lock:
  (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> rabeeh#1 (&mc->mca_lock){+.+...}:
        [<ffffffff810a8027>] validate_chain+0x637/0x730
        [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
        [<ffffffff810a8734>] lock_acquire+0x114/0x150
        [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
        [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
        [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
        [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
        [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
        [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
        [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
        [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
        [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
        [<ffffffff813d9360>] dev_change_flags+0x40/0x70
        [<ffffffff813ea627>] do_setlink+0x237/0x8a0
        [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
        [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
        [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
        [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
        [<ffffffff81403e20>] netlink_unicast+0x140/0x180
        [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
        [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
        [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
        [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
        [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b

 -> #0 (&ndev->lock){+.+...}:
        [<ffffffff810a798e>] check_prev_add+0x3de/0x440
        [<ffffffff810a8027>] validate_chain+0x637/0x730
        [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
        [<ffffffff810a8734>] lock_acquire+0x114/0x150
        [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
        [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
        [<ffffffff8149b036>] mld_newpack+0xb6/0x160
        [<ffffffff8149b18b>] add_grhead+0xab/0xc0
        [<ffffffff8149d03b>] add_grec+0x3ab/0x460
        [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
        [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
        [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
        [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
        [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
        [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
        [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
        [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
        [<ffffffff8106c7de>] kthread+0xae/0xc0
        [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10

actually we can just hold idev->lock before taking pmc->mca_lock,
and avoid taking idev->lock again when iterating idev->addr_list,
since the upper callers of mld_newpack() already take
read_lock_bh(&idev->lock).

Reported-by: dingtianhong <dingtianhong@huawei.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Chen Weilong <chenweilong@huawei.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ET pending data

We accidentally call down to ip6_push_pending_frames when uncorking
pending AF_INET data on a ipv6 socket. This results in the following
splat (from Dave Jones):

skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:126!
invalid opcode: 0000 [rabeeh#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
+netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
RIP: 0010:[<ffffffff816e759c>]  [<ffffffff816e759c>] skb_panic+0x63/0x65
RSP: 0018:ffff8801e6431de8  EFLAGS: 00010282
RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
FS:  00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
Call Trace:
 [<ffffffff8159a9aa>] skb_push+0x3a/0x40
 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0
 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140
 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0
 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20
 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0
 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0
 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40
 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20
 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0
 [<ffffffff816f5d54>] tracesys+0xdd/0xe2
Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
RIP  [<ffffffff816e759c>] skb_panic+0x63/0x65
 RSP <ffff8801e6431de8>

This patch adds a check if the pending data is of address family AF_INET
and directly calls udp_push_ending_frames from udp_v6_push_pending_frames
if that is the case.

This bug was found by Dave Jones with trinity.

(Also move the initialization of fl6 below the AF_INET check, even if
not strictly necessary.)

Cc: Dave Jones <davej@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track
of this when appending the second frame on a corked socket. This results
in the following splat:

[37598.993962] ------------[ cut here ]------------
[37598.994008] kernel BUG at net/core/skbuff.c:2064!
[37598.994008] invalid opcode: 0000 [rabeeh#1] SMP
[37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat
+nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi
+scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm
[37598.994008]  snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc
+dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video
[37598.994008] CPU 0
[37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 rabeeh#1 LENOVO 27744PG/27744PG
[37598.994008] RIP: 0010:[<ffffffff815443a5>]  [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330
[37598.994008] RSP: 0018:ffff88003670da18  EFLAGS: 00010202
[37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0
[37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00
[37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040
[37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8
[37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000
[37598.994008] FS:  00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
[37598.994008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0
[37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0)
[37598.994008] Stack:
[37598.994008]  ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8
[37598.994008]  ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200
[37598.994008]  0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4
[37598.994008] Call Trace:
[37598.994008]  [<ffffffff815fc21f>] ip6_append_data+0xccf/0xfe0
[37598.994008]  [<ffffffff8158d9f0>] ? ip_copy_metadata+0x1a0/0x1a0
[37598.994008]  [<ffffffff81661f66>] ? _raw_spin_lock_bh+0x16/0x40
[37598.994008]  [<ffffffff8161548d>] udpv6_sendmsg+0x1ed/0xc10
[37598.994008]  [<ffffffff812a2845>] ? sock_has_perm+0x75/0x90
[37598.994008]  [<ffffffff815c3693>] inet_sendmsg+0x63/0xb0
[37598.994008]  [<ffffffff812a2973>] ? selinux_socket_sendmsg+0x23/0x30
[37598.994008]  [<ffffffff8153a450>] sock_sendmsg+0xb0/0xe0
[37598.994008]  [<ffffffff810135d1>] ? __switch_to+0x181/0x4a0
[37598.994008]  [<ffffffff8153d97d>] sys_sendto+0x12d/0x180
[37598.994008]  [<ffffffff810dfb64>] ? __audit_syscall_entry+0x94/0xf0
[37598.994008]  [<ffffffff81020ed1>] ? syscall_trace_enter+0x231/0x240
[37598.994008]  [<ffffffff8166a7e7>] tracesys+0xdd/0xe2
[37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48
[37598.994008] RIP  [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330
[37598.994008]  RSP <ffff88003670da18>
[37599.007323] ---[ end trace d69f6a17f8ac8eee ]---

While there, also check if path mtu discovery is activated for this
socket. The logic was adapted from ip6_append_data when first writing
on the corked socket.

This bug was introduced with commit
0c18337 ("ipv6: fix incorrect ipsec
fragment").

v2:
a) Replace IPV6_PMTU_DISC_DO with IPV6_PMTUDISC_PROBE.
b) Don't pass ipv6_pinfo to ip6_append_data_mtu (suggestion by Gao
   feng, thanks!).
c) Change mtu to unsigned int, else we get a warning about
   non-matching types because of the min()-macro type-check.

Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…/kernel/git/vgupta/arc

Pull first batch of ARC changes from Vineet Gupta:
 "There's a second bunch to follow next week - which depends on commits
  on other trees (irq/net).  I'd have preferred the accompanying ARC
  change via respective trees, but it didn't workout somehow.

  Highlights of changes:

   - Continuation of ARC MM changes from 3.10 including

       zero page optimization
       Setting pagecache pages dirty by default
       Non executable stack by default
       Reducing dcache flushes for aliasing VIPT config

   - Long overdue rework of pt_regs machinery - removing the unused word
     gutters and adding ECR register to baseline (helps cleanup lot of
     low level code)

   - Support for ARC gcc 4.8

   - Few other preventive fixes, cosmetics, usage of Kconfig helper..

  The diffstat is larger than normal primarily because of arcregs.h
  header split as well as beautification of macros in entry.h"

* tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (32 commits)
  ARC: warn on improper stack unwind FDE entries
  arc: delete __cpuinit usage from all arc files
  ARC: [tlb-miss] Fix bug with CONFIG_ARC_DBG_TLB_MISS_COUNT
  ARC: [tlb-miss] Extraneous PTE bit testing/setting
  ARC: Adjustments for gcc 4.8
  ARC: Setup Vector Table Base in early boot
  ARC: Remove explicit passing around of ECR
  ARC: pt_regs update rabeeh#5: Use real ECR for pt_regs->event vs. synth values
  ARC: stop using pt_regs->orig_r8
  ARC: pt_regs update rabeeh#4: r25 saved/restored unconditionally
  ARC: K/U SP saved from one location in stack switching macro
  ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption
  ARC: Entry Handler tweaks: Avoid hardcoded LIMMS for ECR values
  ARC: Increase readability of entry handlers
  ARC: pt_regs update rabeeh#3: Remove unused gutter at start of callee_regs
  ARC: pt_regs update rabeeh#2: Remove unused gutter at start of pt_regs
  ARC: pt_regs update rabeeh#1: Align pt_regs end with end of kernel stack page
  ARC: pt_regs update #0: remove kernel stack canary
  ARC: [mm] Remove @Write argument to do_page_fault()
  ARC: [mm] Make stack/heap Non-executable by default
  ...
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
when mounting ceph with a dev name that starts with a slash, ceph
would attempt to access the character before that slash. Since we
don't actually own that byte of memory, we would trigger an
invalid access:

[   43.499934] BUG: unable to handle kernel paging request at ffff880fa3a97fff
[   43.500984] IP: [<ffffffff818f3884>] parse_mount_options+0x1a4/0x300
[   43.501491] PGD 743b067 PUD 10283c4067 PMD 10282a6067 PTE 8000000fa3a97060
[   43.502301] Oops: 0000 [rabeeh#1] PREEMPT SMP DEBUG_PAGEALLOC
[   43.503006] Dumping ftrace buffer:
[   43.503596]    (ftrace buffer empty)
[   43.504046] CPU: 0 PID: 10879 Comm: mount Tainted: G        W    3.10.0-sasha #1129
[   43.504851] task: ffff880fa625b000 ti: ffff880fa3412000 task.ti: ffff880fa3412000
[   43.505608] RIP: 0010:[<ffffffff818f3884>]  [<ffffffff818f3884>] parse_mount_options$
[   43.506552] RSP: 0018:ffff880fa3413d08  EFLAGS: 00010286
[   43.507133] RAX: ffff880fa3a98000 RBX: ffff880fa3a98000 RCX: 0000000000000000
[   43.507893] RDX: ffff880fa3a98001 RSI: 000000000000002f RDI: ffff880fa3a98000
[   43.508610] RBP: ffff880fa3413d58 R08: 0000000000001f99 R09: ffff880fa3fe64c0
[   43.509426] R10: ffff880fa3413d98 R11: ffff880fa38710d8 R12: ffff880fa3413da0
[   43.509792] R13: ffff880fa3a97fff R14: 0000000000000000 R15: ffff880fa3413d90
[   43.509792] FS:  00007fa9c48757e0(0000) GS:ffff880fd2600000(0000) knlGS:000000000000$
[   43.509792] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   43.509792] CR2: ffff880fa3a97fff CR3: 0000000fa3bb9000 CR4: 00000000000006b0
[   43.509792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   43.509792] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   43.509792] Stack:
[   43.509792]  0000e5180000000e ffffffff85ca1900 ffff880fa38710d8 ffff880fa3413d98
[   43.509792]  0000000000000120 0000000000000000 ffff880fa3a98000 0000000000000000
[   43.509792]  ffffffff85cf32a0 0000000000000000 ffff880fa3413dc8 ffffffff818f3c72
[   43.509792] Call Trace:
[   43.509792]  [<ffffffff818f3c72>] ceph_mount+0xa2/0x390
[   43.509792]  [<ffffffff81226314>] ? pcpu_alloc+0x334/0x3c0
[   43.509792]  [<ffffffff81282f8d>] mount_fs+0x8d/0x1a0
[   43.509792]  [<ffffffff812263d0>] ? __alloc_percpu+0x10/0x20
[   43.509792]  [<ffffffff8129f799>] vfs_kern_mount+0x79/0x100
[   43.509792]  [<ffffffff812a224d>] do_new_mount+0xcd/0x1c0
[   43.509792]  [<ffffffff812a2e8d>] do_mount+0x15d/0x210
[   43.509792]  [<ffffffff81220e55>] ? strndup_user+0x45/0x60
[   43.509792]  [<ffffffff812a2fdd>] SyS_mount+0x9d/0xe0
[   43.509792]  [<ffffffff83fd816c>] tracesys+0xdd/0xe2
[   43.509792] Code: 4c 8b 5d c0 74 0a 48 8d 50 01 49 89 14 24 eb 17 31 c0 48 83 c9 ff $
[   43.509792] RIP  [<ffffffff818f3884>] parse_mount_options+0x1a4/0x300
[   43.509792]  RSP <ffff880fa3413d08>
[   43.509792] CR2: ffff880fa3a97fff
[   43.509792] ---[ end trace 22469cd81e93af51 ]---

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Sage Weil <sage@inktan.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Inlined xattr shared free space of inode block with inlined data or data
extent record, so the size of the later two should be adjusted when
inlined xattr is enabled.  See ocfs2_xattr_ibody_init().  But this isn't
done well when reflink.  For inode with inlined data, its max inlined
data size is adjusted in ocfs2_duplicate_inline_data(), no problem.  But
for inode with data extent record, its record count isn't adjusted.  Fix
it, or data extent record and inlined xattr may overwrite each other,
then cause data corruption or xattr failure.

One panic caused by this bug in our test environment is the following:

  kernel BUG at fs/ocfs2/xattr.c:1435!
  invalid opcode: 0000 [rabeeh#1] SMP
  Pid: 10871, comm: multi_reflink_t Not tainted 2.6.39-300.17.1.el5uek rabeeh#1
  RIP: ocfs2_xa_offset_pointer+0x17/0x20 [ocfs2]
  RSP: e02b:ffff88007a587948  EFLAGS: 00010283
  RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00000000000051e4
  RDX: ffff880057092060 RSI: 0000000000000f80 RDI: ffff88007a587a68
  RBP: ffff88007a587948 R08: 00000000000062f4 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000010
  R13: ffff88007a587a68 R14: 0000000000000001 R15: ffff88007a587c68
  FS:  00007fccff7f06e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
  CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00000000015cf000 CR3: 000000007aa76000 CR4: 0000000000000660
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process multi_reflink_t
  Call Trace:
    ocfs2_xa_reuse_entry+0x60/0x280 [ocfs2]
    ocfs2_xa_prepare_entry+0x17e/0x2a0 [ocfs2]
    ocfs2_xa_set+0xcc/0x250 [ocfs2]
    ocfs2_xattr_ibody_set+0x98/0x230 [ocfs2]
    __ocfs2_xattr_set_handle+0x4f/0x700 [ocfs2]
    ocfs2_xattr_set+0x6c6/0x890 [ocfs2]
    ocfs2_xattr_user_set+0x46/0x50 [ocfs2]
    generic_setxattr+0x70/0x90
    __vfs_setxattr_noperm+0x80/0x1a0
    vfs_setxattr+0xa9/0xb0
    setxattr+0xc3/0x120
    sys_fsetxattr+0xa8/0xd0
    system_call_fastpath+0x16/0x1b

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Since commit 2025172 (spi/bitbang: Use core message pump), the following
kernel crash is seen:

Unable to handle kernel NULL pointer dereference at virtual address 0000000d
pgd = 80004000
[0000000d] *pgd=00000000
Internal error: Oops: 5 [rabeeh#1] SMP ARM
Modules linked in:
CPU: 1 PID: 48 Comm: spi32766 Not tainted 3.11.0-rc1+ rabeeh#4
task: bfa3e580 ti: bfb90000 task.ti: bfb90000
PC is at spi_bitbang_transfer_one+0x50/0x248
LR is at spi_bitbang_transfer_one+0x20/0x248
...

,and also the following build warning:

drivers/spi/spi-bitbang.c: In function 'spi_bitbang_start':
drivers/spi/spi-bitbang.c:436:31: warning: assignment from incompatible pointer type [enabled by default]

In order to fix it, we need to change the first parameter of
spi_bitbang_transfer_one() to 'struct spi_master *master'.

Tested on a mx6qsabrelite by succesfully probing a SPI NOR flash.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
o Driver was freeing Tx frag which was never mapped before which
  result into panic as kernel was unable to handle paging request.

BUG: unable to handle kernel paging request at ffffc9002552a000
IP: [<ffffffffa05ed762>] qlcnic_release_tx_buffers+0x72/0x170 [qlcnic]
PGD 87fc15067 PUD 47febf067 PMD 4758c5067 PTE 0
Oops: 0000 [rabeeh#1] SMP

crash> bt
PID: 27343  TASK: ffff8802a5de8080  CPU: 27  COMMAND: "ifconfig"
   [ffff8802a34b3850] machine_kexec at ffffffff81035b7b
   [ffff8802a34b38b0] crash_kexec at ffffffff810c0db2
   [ffff8802a34b3980] oops_end at ffffffff815111d0
   [ffff8802a34b39b0] no_context at ffffffff81046bfb
   [ffff8802a34b3a00] __bad_area_nosemaphore at ffffffff81046e85
   [ffff8802a34b3a50] bad_area_nosemaphore at ffffffff81046f53
   [ffff8802a34b3a60] __do_page_fault at ffffffff810476b1
   [ffff8802a34b3b80] do_page_fault at ffffffff8151311e
   [ffff8802a34b3bb0] page_fault at ffffffff815104d5
    [exception RIP: qlcnic_release_tx_buffers+114]
    RIP: ffffffffa05ed762  RSP: ffff8802a34b3c68  RFLAGS: 00010246
    RAX: ffff88087989c000  RBX: ffffc90025529ff8  RCX: 0000000000000001
    RDX: 0000000000000013  RSI: 0000000000000013  RDI: 0000000000000000
    RBP: ffff8802a34b3ca8   R8: 0000000000000000   R9: 0000000000000000
    R10: 000000000000000c  R11: 0000000000000000  R12: 0000000000000012
    R13: ffffc90025529ec0  R14: ffff880761e876e0  R15: 00000000000003ff
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   [ffff8802a34b3cb0] __qlcnic_down at ffffffffa05e8b15 [qlcnic]
   [ffff8802a34b3d00] qlcnic_close at ffffffffa05e8b78 [qlcnic]
   [ffff8802a34b3d10] dev_close at ffffffff81449d81
   [ffff8802a34b3d30] dev_change_flags at ffffffff814495c1

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always().  We can use the following program to trigger it:

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_TMPFILE, 0666);
	if (fd < 0) {
		perror("open ");
		return -1;
	}
	close(fd);
	return 0;
}

The oops message looks like this:

kernel BUG at fs/ext4/namei.c:2572!
invalid opcode: 0000 [rabeeh#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli
nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir
da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp
kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm
CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ linux4kix#12
Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000
RIP: 0010:[<ffffffff8125ce69>]  [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0
RSP: 0018:ffff88010f7b7cf8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001
RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668
R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0
FS:  00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0
DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00
 ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c
 ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004
Call Trace:
 [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180
 [<ffffffff811cba78>] path_openat+0x238/0x700
 [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80
 [<ffffffff811cc647>] do_filp_open+0x47/0xa0
 [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200
 [<ffffffff811ba2e4>] do_sys_open+0x124/0x210
 [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290
 [<ffffffff811ba3ee>] SyS_open+0x1e/0x20
 [<ffffffff816ca8d4>] tracesys+0xdd/0xe2
 [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0
Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00

Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always().  We can use the following program to trigger it:

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_TMPFILE, 0666);
	if (fd < 0) {
		perror("open ");
		return -1;
	}
	close(fd);
	return 0;
}

The oops message looks like this:

kernel: kernel BUG at fs/ext3/namei.c:1992!
kernel: invalid opcode: 0000 [rabeeh#1] SMP
kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd
kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ rabeeh#4
kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010
kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000
kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
kernel: RSP: 0018:ffff8801124d5cc8  EFLAGS: 00010202
kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0
kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8
kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f
kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000
kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8
kernel: FS:  00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0
kernel: Stack:
kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7
kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000
kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1
kernel: Call Trace:
kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd]
kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3]
kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7
kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30
kernel: [<ffffffff82065fa2>] ?  __dequeue_entity+0x33/0x38
kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d
kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102
kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd
kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20
kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b
kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0
kernel: RIP  [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
kernel: RSP <ffff8801124d5cc8>

Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
In xfs_vm_write_failed(), we evaluate the block_offset of pos with
PAGE_MASK which is an unsigned long.  That is fine on 64-bit platforms
regardless of whether the request pos is 32-bit or 64-bit.  However, on
32-bit platforms the value is 0xfffff000 and so the high 32 bits in it
will be masked off with (pos & PAGE_MASK) for a 64-bit pos.

As a result, the evaluated block_offset is incorrect which will cause
this failure ASSERT(block_offset + from == pos); and potentially pass
the wrong block to xfs_vm_kill_delalloc_range().

In this case, we can get a kernel panic if CONFIG_XFS_DEBUG is enabled:

XFS: Assertion failed: block_offset + from == pos, file: fs/xfs/xfs_aops.c, line: 1504

------------[ cut here ]------------
 kernel BUG at fs/xfs/xfs_message.c:100!
 invalid opcode: 0000 [rabeeh#1] SMP
 ........
 Pid: 4057, comm: mkfs.xfs Tainted: G           O 3.9.0-rc2 rabeeh#1
 EIP: 0060:[<f94a7e8b>] EFLAGS: 00010282 CPU: 0
 EIP is at assfail+0x2b/0x30 [xfs]
 EAX: 00000056 EBX: f6ef28a0 ECX: 00000007 EDX: f57d22a4
 ESI: 1c2fb000 EDI: 00000000 EBP: ea6b5d30 ESP: ea6b5d1c
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
 CR0: 8005003b CR2: 094f3ff4 CR3: 2bcb4000 CR4: 000006f0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
 Process mkfs.xfs (pid: 4057, ti=ea6b4000 task=ea5799e0 task.ti=ea6b4000)
 Stack:
 00000000 f9525c48 f951fa80 f951f96b 000005e4 ea6b5d7c f9494b34 c19b0ea2
 00000066 f3d6c620 c19b0ea2 00000000 e9a91458 00001000 00000000 00000000
 00000000 c15c7e89 00000000 1c2fb000 00000000 00000000 1c2fb000 00000080
 Call Trace:
 [<f9494b34>] xfs_vm_write_failed+0x74/0x1b0 [xfs]
 [<c15c7e89>] ? printk+0x4d/0x4f
 [<f9494d7d>] xfs_vm_write_begin+0x10d/0x170 [xfs]
 [<c110a34c>] generic_file_buffered_write+0xdc/0x210
 [<f949b669>] xfs_file_buffered_aio_write+0xf9/0x190 [xfs]
 [<f949b7f3>] xfs_file_aio_write+0xf3/0x160 [xfs]
 [<c115e504>] do_sync_write+0x94/0xd0
 [<c115ed1f>] vfs_write+0x8f/0x160
 [<c115e470>] ? wait_on_retry_sync_kiocb+0x50/0x50
 [<c115f017>] sys_write+0x47/0x80
 [<c15d860d>] sysenter_do_call+0x12/0x28
 .............
 EIP: [<f94a7e8b>] assfail+0x2b/0x30 [xfs] SS:ESP 0068:ea6b5d1c
 ---[ end trace cdd9af4f4ecab42f ]---
 Kernel panic - not syncing: Fatal exception

In order to avoid this, we can evaluate the block_offset of the start
of the page by using shifts rather than masks the mismatch problem.

Thanks Dave Chinner for help finding and fixing this bug.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Quit from splice_write if pipe->nrbufs is 0 for avoiding oops in virtio-serial.

When an application was doing splice from a kernel buffer to virtio-serial on
a guest, the application received signal(SIGINT). This situation will normally
happen, but the kernel executed a kernel panic by oops as follows:

 BUG: unable to handle kernel paging request at ffff882071c8ef28
 IP: [<ffffffff812de48f>] sg_init_table+0x2f/0x50
 PGD 1fac067 PUD 0
 Oops: 0000 [rabeeh#1] SMP
 Modules linked in: lockd sunrpc bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd microcode virtio_balloon virtio_net pcspkr soundcore i2c_piix4 i2c_core uinput floppy
 CPU: 1 PID: 908 Comm: trace-cmd Not tainted 3.10.0+ #49
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 task: ffff880071c64650 ti: ffff88007bf24000 task.ti: ffff88007bf24000
 RIP: 0010:[<ffffffff812de48f>]  [<ffffffff812de48f>] sg_init_table+0x2f/0x50
 RSP: 0018:ffff88007bf25dd8  EFLAGS: 00010286
 RAX: 0000001fffffffe0 RBX: ffff882071c8ef28 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880071c8ef48
 RBP: ffff88007bf25de8 R08: ffff88007fd15d40 R09: ffff880071c8ef48
 R10: ffffea0001c71040 R11: ffffffff8139c555 R12: 0000000000000000
 R13: ffff88007506a3c0 R14: ffff88007c862500 R15: ffff880071c8ef00
 FS:  00007f0a3646c740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff882071c8ef28 CR3: 000000007acbb000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Stack:
  ffff880071c8ef48 ffff88007bf25e20 ffff88007bf25e88 ffffffff8139d6fa
  ffff88007bf25e28 ffffffff8127a3f4 0000000000000000 0000000000000000
  ffff880071c8ef48 0000100000000000 0000000000000003 ffff88007bf25e08
 Call Trace:
  [<ffffffff8139d6fa>] port_fops_splice_write+0xaa/0x130
  [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120
  [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0
  [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110
  [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0
  [<ffffffff8161f8c2>] system_call_fastpath+0x16/0x1b
 Code: c1 e2 05 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 31 f6 48 89 5d f0 48 89 fb e8 8d ce ff ff 41 8d 44 24 ff 48 c1 e0 05 48 01 c3 <48> 8b 03 48 83 e0 fe 48 83 c8 02 48 89 03 48 8b 5d f0 4c 8b 65
 RIP  [<ffffffff812de48f>] sg_init_table+0x2f/0x50
  RSP <ffff88007bf25dd8>
 CR2: ffff882071c8ef28
 ---[ end trace 86323505eb42ea8f ]---

It seems to induce pagefault in sg_init_tabel() when pipe->nrbufs is equal to
zero. This may happen in a following situation:

(1) The application normally does splice(read) from a kernel buffer, then does
    splice(write) to virtio-serial.
(2) The application receives SIGINT when is doing splice(read), so splice(read)
    is failed by EINTR. However, the application does not finish the operation.
(3) The application tries to do splice(write) without pipe->nrbufs.
(4) The virtio-console driver tries to touch scatterlist structure sgl in
    sg_init_table(), but the region is out of bound.

To avoid the case, a kernel should check whether pipe->nrbufs is empty or not
when splice_write is executed in the virtio-console driver.

V3: Add Reviewed-by lines and stable@ line in sign-off area.

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Add pipe_lock/unlock for splice_write to avoid oops by following competition:

(1) An application gets fds of a trace buffer, virtio-serial, pipe.
(2) The application does fork()
(3) The processes execute splice_read(trace buffer) and
    splice_write(virtio-serial) via same pipe.

        <parent>                   <child>
  get fds of a trace buffer,
         virtio-serial, pipe
          |
        fork()----------create--------+
          |                           |
      splice(read)                    |           ---+
      splice(write)                   |              +-- no competition
          |                       splice(read)       |
          |                       splice(write)   ---+
          |                           |
      splice(read)                    |
      splice(write)               splice(read)    ------ competition
          |                       splice(write)

Two processes share a pipe_inode_info structure. If the child execute
splice(read) when the parent tries to execute splice(write), the
structure can be broken. Existing virtio-serial driver does not get
lock for the structure in splice_write, so this competition will induce
oops.

<oops messages>
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
 IP: [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130
 PGD 7223e067 PUD 72391067 PMD 0
 Oops: 0000 [rabeeh#1] SMP
 Modules linked in: lockd bnep bluetooth rfkill sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr virtio_net virtio_balloon i2c_piix4 i2c_core microcode uinput floppy
 CPU: 0 PID: 1072 Comm: compete-test Not tainted 3.10.0ws+ #55
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 task: ffff880071b98000 ti: ffff88007b55e000 task.ti: ffff88007b55e000
 RIP: 0010:[<ffffffff811a6b5f>]  [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130
 RSP: 0018:ffff88007b55fd78  EFLAGS: 00010287
 RAX: 0000000000000000 RBX: ffff88007b55fe20 RCX: 0000000000000000
 RDX: 0000000000001000 RSI: ffff88007a95ba30 RDI: ffff880036f9e6c0
 RBP: ffff88007b55fda8 R08: 00000000000006ec R09: ffff880077626708
 R10: 0000000000000003 R11: ffffffff8139ca59 R12: ffff88007a95ba30
 R13: 0000000000000000 R14: ffffffff8139dd00 R15: ffff880036f9e6c0
 FS:  00007f2e2e3a0740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000018 CR3: 0000000071bd1000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Stack:
  ffffffff8139ca59 ffff88007b55fe20 ffff880036f9e6c0 ffffffff8139dd00
  ffff8800776266c0 ffff880077626708 ffff88007b55fde8 ffffffff811a6e8e
  ffff88007b55fde8 ffffffff8139ca59 ffff880036f9e6c0 ffff88007b55fe20
 Call Trace:
  [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0
  [<ffffffff8139dd00>] ? virtcons_restore+0x100/0x100
  [<ffffffff811a6e8e>] __splice_from_pipe+0x7e/0x90
  [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0
  [<ffffffff8139d739>] port_fops_splice_write+0xe9/0x140
  [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120
  [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0
  [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110
  [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0
  [<ffffffff8161facf>] tracesys+0xdd/0xe2
 Code: 49 8b 87 80 00 00 00 4c 8d 24 d0 8b 53 04 41 8b 44 24 0c 4d 8b 6c 24 10 39 d0 89 03 76 02 89 13 49 8b 44 24 10 4c 89 e6 4c 89 ff <ff> 50 18 85 c0 0f 85 aa 00 00 00 48 89 da 4c 89 e6 4c 89 ff 41
 RIP  [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130
  RSP <ffff88007b55fd78>
 CR2: 0000000000000018
 ---[ end trace 24572beb7764de59 ]---

V2: Fix a locking problem for error
V3: Add Reviewed-by lines and stable@ line in sign-off area

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Commit 8ef6e62 (ARM: footbridge: use
fixed PCI i/o mapping) broke booting on my netwinder.  Before that,
everything boots fine.  Since then, it crashes on boot.

With earlyprintk, I see it BUG-ing like so:
kernel BUG at lib/ioremap.c:27!
Internal error: Oops - BUG: 0 [rabeeh#1] ARM
...
[<c0139b54>] (ioremap_page_range+0x128/0x154) from [<c02e6a6c>] (dc21285_setup+0xd0/0x114)
[<c02e6a6c>] (dc21285_setup+0xd0/0x114) from [<c02e4874>] (pci_common_init+0xa0/0x298)
[<c02e4874>] (pci_common_init+0xa0/0x298) from [<c02e793c>] (netwinder_pci_init+0xc/0x18)
[<c02e793c>] (netwinder_pci_init+0xc/0x18) from [<c02e27d0>] (do_one_initcall+0xb4/0x180)
...

Russell points out it's because of overlapping PCI mappings that was
added with the aforementioned commit.  Rob thought the code would re-use
the static mapping, but that turns out to not be the case and instead
hits the BUG further down.

After deleting this hunk as suggested by Russel, the system boots up fine
again and all my PCI devices work (IDE, ethernet, the DC21285).

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Cc: stable@vger.kernel.org	# v3.5+
Signed-off-by: Olof Johansson <olof@lixom.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ifier

In fd4363f ("x86: Introduce int3 (breakpoint)-based
instruction patching"), the mechanism that was introduced for
notifying alternatives code from int3 exception handler that and
exception occured was die_notifier.

This is however problematic, as early code might be using jump
labels even before the notifier registration has been performed,
which will then lead to an oops due to unhandled exception. One
of such occurences has been encountered by Fengguang:

 int3: 0000 [rabeeh#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc1-01429-g04bf576 linux4kix#8
 task: ffff88000da1b040 ti: ffff88000da1c000 task.ti: ffff88000da1c000
 RIP: 0010:[<ffffffff811098cc>]  [<ffffffff811098cc>] ttwu_do_wakeup+0x28/0x225
 RSP: 0000:ffff88000dd03f10  EFLAGS: 00000006
 RAX: 0000000000000000 RBX: ffff88000dd12940 RCX: ffffffff81769c40
 RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000001
 RBP: ffff88000dd03f28 R08: ffffffff8176a8c0 R09: 0000000000000002
 R10: ffffffff810ff484 R11: ffff88000dd129e8 R12: ffff88000dbc90c0
 R13: ffff88000dbc90c0 R14: ffff88000da1dfd8 R15: ffff88000da1dfd8
 FS:  0000000000000000(0000) GS:ffff88000dd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00000000ffffffff CR3: 0000000001c88000 CR4: 00000000000006e0
 Stack:
  ffff88000dd12940 ffff88000dbc90c0 ffff88000da1dfd8 ffff88000dd03f48
  ffffffff81109e2b ffff88000dd12940 0000000000000000 ffff88000dd03f68
  ffffffff81109e9e 0000000000000000 0000000000012940 ffff88000dd03f98
 Call Trace:
  <IRQ>
  [<ffffffff81109e2b>] ttwu_do_activate.constprop.56+0x6d/0x79
  [<ffffffff81109e9e>] sched_ttwu_pending+0x67/0x84
  [<ffffffff8110c845>] scheduler_ipi+0x15a/0x2b0
  [<ffffffff8104dfb4>] smp_reschedule_interrupt+0x38/0x41
  [<ffffffff8173bf5d>] reschedule_interrupt+0x6d/0x80
  <EOI>
  [<ffffffff810ff484>] ? __atomic_notifier_call_chain+0x5/0xc1
  [<ffffffff8105cc30>] ? native_safe_halt+0xd/0x16
  [<ffffffff81015f10>] default_idle+0x147/0x282
  [<ffffffff81017026>] arch_cpu_idle+0x3d/0x5d
  [<ffffffff81127d6a>] cpu_idle_loop+0x46d/0x5db
  [<ffffffff81127f5c>] cpu_startup_entry+0x84/0x84
  [<ffffffff8104f4f8>] start_secondary+0x3c8/0x3d5
  [...]

Fix this by directly calling poke_int3_handler() from the int3
exception handler (analogically to what ftrace has been doing
already), instead of relying on notifier, registration of which
might not have yet been finalized by the time of the first trap.

Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1307231007490.14024@pobox.suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Fix the following:

BUG: key ffff88043bdd0330 not in .data!
------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0()
DEBUG_LOCKS_WARN_ON(1)
Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor microcode
CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 rabeeh#1
Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
 0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958
 ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000
 ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc
Call Trace:
  dump_stack
  warn_slowpath_common
  warn_slowpath_fmt
  lockdep_init_map
  ? trace_hardirqs_on_caller
  ? trace_hardirqs_on
  debug_mutex_init
  __mutex_init
  bus_register
  edac_create_sysfs_mci_device
  edac_mc_add_mc
  sbridge_probe
  pci_device_probe
  driver_probe_device
  __driver_attach
  ? driver_probe_device
  bus_for_each_dev
  driver_attach
  bus_add_driver
  driver_register
  __pci_register_driver
  ? 0xffffffffa0010fff
  sbridge_init
  ? 0xffffffffa0010fff
  do_one_initcall
  load_module
  ? unset_module_init_ro_nx
  SyS_init_module
  tracesys
---[ end trace d24a70b0d3ddf733 ]---
EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0
EDAC sbridge: Driver loaded.

What happens is that bus_register needs a statically allocated lock_key
because the last is handed in to lockdep. However, struct mem_ctl_info
embeds struct bus_type (the whole struct, not a pointer to it) and the
whole thing gets dynamically allocated.

Fix this by using a statically allocated struct bus_type for the MC bus.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: stable@kernel.org # v3.10
Signed-off-by: Tony Luck <tony.luck@intel.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Presently, using exynos_defconfig with CONFIG_DEBUG_LL and CONFIG_EARLY_PRIN
on, kernel is not booting, we are getting following:

[    0.000000] ------------[ cut here ]------------
[    0.000000] kernel BUG at mm/vmalloc.c:1134!
[    0.000000] Internal error: Oops - BUG: 0 [rabeeh#1] PREEMPT SMP ARM
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.11.0-rc1 #633
[    0.000000] task: c052ec48 ti: c0524000 task.ti: c0524000
[    0.000000] PC is at vm_area_add_early+0x54/0x94
[    0.000000] LR is at add_static_vm_early+0xc/0x60

Its because exynos[4/5]_map_io() function ioremaps a single 512KB memory
size for all the four uart ports which envelopes the mapping created by
debug_ll_io_init(), called earlier in exynos_init_io().

This patch removes iodesc entries for UART controller for all Samsung SoC's,
since now the Samsung uart driver does a ioremap during probe and any needed
iomapping for earlyprintk will be handled by debug_ll_io_init().

Tested on smdk4412 and smdk5250.

Signed-off-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Commits 6a1c068 and
9356b53, respectively
  'tty: Convert termios_mutex to termios_rwsem' and
  'n_tty: Access termios values safely'
introduced a circular lock dependency with console_lock and
termios_rwsem.

The lockdep report [1] shows that n_tty_write() will attempt
to claim console_lock while holding the termios_rwsem, whereas
tty_do_resize() may already hold the console_lock while
claiming the termios_rwsem.

Since n_tty_write() and tty_do_resize() do not contend
over the same data -- the tty->winsize structure -- correct
the lock dependency by introducing a new lock which
specifically serializes access to tty->winsize only.

[1] Lockdep report

======================================================
[ INFO: possible circular locking dependency detected ]
3.10.0-0+tip-xeon+lockdep #0+tip Not tainted
-------------------------------------------------------
modprobe/277 is trying to acquire lock:
 (&tty->termios_rwsem){++++..}, at: [<ffffffff81452656>] tty_do_resize+0x36/0xe0

but task is already holding lock:
 ((fb_notifier_list).rwsem){.+.+.+}, at: [<ffffffff8107aac6>] __blocking_notifier_call_chain+0x56/0xc0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> rabeeh#2 ((fb_notifier_list).rwsem){.+.+.+}:
       [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
       [<ffffffff8175b797>] down_read+0x47/0x5c
       [<ffffffff8107aac6>] __blocking_notifier_call_chain+0x56/0xc0
       [<ffffffff8107ab46>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff813d7c0b>] fb_notifier_call_chain+0x1b/0x20
       [<ffffffff813d95b2>] register_framebuffer+0x1e2/0x320
       [<ffffffffa01043e1>] drm_fb_helper_initial_config+0x371/0x540 [drm_kms_helper]
       [<ffffffffa01bcb05>] nouveau_fbcon_init+0x105/0x140 [nouveau]
       [<ffffffffa01ad0af>] nouveau_drm_load+0x43f/0x610 [nouveau]
       [<ffffffffa008a79e>] drm_get_pci_dev+0x17e/0x2a0 [drm]
       [<ffffffffa01ad4da>] nouveau_drm_probe+0x25a/0x2a0 [nouveau]
       [<ffffffff813b13db>] local_pci_probe+0x4b/0x80
       [<ffffffff813b1701>] pci_device_probe+0x111/0x120
       [<ffffffff814977eb>] driver_probe_device+0x8b/0x3a0
       [<ffffffff81497bab>] __driver_attach+0xab/0xb0
       [<ffffffff814956ad>] bus_for_each_dev+0x5d/0xa0
       [<ffffffff814971fe>] driver_attach+0x1e/0x20
       [<ffffffff81496cc1>] bus_add_driver+0x111/0x290
       [<ffffffff814982b7>] driver_register+0x77/0x170
       [<ffffffff813b0454>] __pci_register_driver+0x64/0x70
       [<ffffffffa008a9da>] drm_pci_init+0x11a/0x130 [drm]
       [<ffffffffa022a04d>] nouveau_drm_init+0x4d/0x1000 [nouveau]
       [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
       [<ffffffff810c54cb>] load_module+0x123b/0x1bf0
       [<ffffffff810c5f57>] SyS_init_module+0xd7/0x120
       [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

-> rabeeh#1 (console_lock){+.+.+.}:
       [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
       [<ffffffff810430a7>] console_lock+0x77/0x80
       [<ffffffff8146b2a1>] con_flush_chars+0x31/0x50
       [<ffffffff8145780c>] n_tty_write+0x1ec/0x4d0
       [<ffffffff814541b9>] tty_write+0x159/0x2e0
       [<ffffffff814543f5>] redirected_tty_write+0xb5/0xc0
       [<ffffffff811ab9d5>] vfs_write+0xc5/0x1f0
       [<ffffffff811abec5>] SyS_write+0x55/0xa0
       [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

-> #0 (&tty->termios_rwsem){++++..}:
       [<ffffffff810b65c3>] __lock_acquire+0x1c43/0x1d30
       [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
       [<ffffffff8175b724>] down_write+0x44/0x70
       [<ffffffff81452656>] tty_do_resize+0x36/0xe0
       [<ffffffff8146c841>] vc_do_resize+0x3e1/0x4c0
       [<ffffffff8146c99f>] vc_resize+0x1f/0x30
       [<ffffffff813e4535>] fbcon_init+0x385/0x5a0
       [<ffffffff8146a4bc>] visual_init+0xbc/0x120
       [<ffffffff8146cd13>] do_bind_con_driver+0x163/0x320
       [<ffffffff8146cfa1>] do_take_over_console+0x61/0x70
       [<ffffffff813e2b93>] do_fbcon_takeover+0x63/0xc0
       [<ffffffff813e67a5>] fbcon_event_notify+0x715/0x820
       [<ffffffff81762f9d>] notifier_call_chain+0x5d/0x110
       [<ffffffff8107aadc>] __blocking_notifier_call_chain+0x6c/0xc0
       [<ffffffff8107ab46>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff813d7c0b>] fb_notifier_call_chain+0x1b/0x20
       [<ffffffff813d95b2>] register_framebuffer+0x1e2/0x320
       [<ffffffffa01043e1>] drm_fb_helper_initial_config+0x371/0x540 [drm_kms_helper]
       [<ffffffffa01bcb05>] nouveau_fbcon_init+0x105/0x140 [nouveau]
       [<ffffffffa01ad0af>] nouveau_drm_load+0x43f/0x610 [nouveau]
       [<ffffffffa008a79e>] drm_get_pci_dev+0x17e/0x2a0 [drm]
       [<ffffffffa01ad4da>] nouveau_drm_probe+0x25a/0x2a0 [nouveau]
       [<ffffffff813b13db>] local_pci_probe+0x4b/0x80
       [<ffffffff813b1701>] pci_device_probe+0x111/0x120
       [<ffffffff814977eb>] driver_probe_device+0x8b/0x3a0
       [<ffffffff81497bab>] __driver_attach+0xab/0xb0
       [<ffffffff814956ad>] bus_for_each_dev+0x5d/0xa0
       [<ffffffff814971fe>] driver_attach+0x1e/0x20
       [<ffffffff81496cc1>] bus_add_driver+0x111/0x290
       [<ffffffff814982b7>] driver_register+0x77/0x170
       [<ffffffff813b0454>] __pci_register_driver+0x64/0x70
       [<ffffffffa008a9da>] drm_pci_init+0x11a/0x130 [drm]
       [<ffffffffa022a04d>] nouveau_drm_init+0x4d/0x1000 [nouveau]
       [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
       [<ffffffff810c54cb>] load_module+0x123b/0x1bf0
       [<ffffffff810c5f57>] SyS_init_module+0xd7/0x120
       [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

Chain exists of:
  &tty->termios_rwsem --> console_lock --> (fb_notifier_list).rwsem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((fb_notifier_list).rwsem);
                               lock(console_lock);
                               lock((fb_notifier_list).rwsem);
  lock(&tty->termios_rwsem);

 *** DEADLOCK ***

7 locks held by modprobe/277:
 #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff81497b5b>] __driver_attach+0x5b/0xb0
 rabeeh#1:  (&__lockdep_no_validate__){......}, at: [<ffffffff81497b69>] __driver_attach+0x69/0xb0
 rabeeh#2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffa008a6dd>] drm_get_pci_dev+0xbd/0x2a0 [drm]
 rabeeh#3:  (registration_lock){+.+.+.}, at: [<ffffffff813d93f5>] register_framebuffer+0x25/0x320
 rabeeh#4:  (&fb_info->lock){+.+.+.}, at: [<ffffffff813d8116>] lock_fb_info+0x26/0x60
 rabeeh#5:  (console_lock){+.+.+.}, at: [<ffffffff813d95a4>] register_framebuffer+0x1d4/0x320
 rabeeh#6:  ((fb_notifier_list).rwsem){.+.+.+}, at: [<ffffffff8107aac6>] __blocking_notifier_call_chain+0x56/0xc0

stack backtrace:
CPU: 0 PID: 277 Comm: modprobe Not tainted 3.10.0-0+tip-xeon+lockdep #0+tip
Hardware name: Dell Inc. Precision WorkStation T5400  /0RW203, BIOS A11 04/30/2012
 ffffffff8213e5e0 ffff8802aa2fb298 ffffffff81755f19 ffff8802aa2fb2e8
 ffffffff8174f506 ffff8802aa2fa000 ffff8802aa2fb378 ffff8802aa2ea8e8
 ffff8802aa2ea910 ffff8802aa2ea8e8 0000000000000006 0000000000000007
Call Trace:
 [<ffffffff81755f19>] dump_stack+0x19/0x1b
 [<ffffffff8174f506>] print_circular_bug+0x1fb/0x20c
 [<ffffffff810b65c3>] __lock_acquire+0x1c43/0x1d30
 [<ffffffff810b775e>] ? mark_held_locks+0xae/0x120
 [<ffffffff810b78d5>] ? trace_hardirqs_on_caller+0x105/0x1d0
 [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
 [<ffffffff81452656>] ? tty_do_resize+0x36/0xe0
 [<ffffffff8175b724>] down_write+0x44/0x70
 [<ffffffff81452656>] ? tty_do_resize+0x36/0xe0
 [<ffffffff81452656>] tty_do_resize+0x36/0xe0
 [<ffffffff8146c841>] vc_do_resize+0x3e1/0x4c0
 [<ffffffff8146c99f>] vc_resize+0x1f/0x30
 [<ffffffff813e4535>] fbcon_init+0x385/0x5a0
 [<ffffffff8146a4bc>] visual_init+0xbc/0x120
 [<ffffffff8146cd13>] do_bind_con_driver+0x163/0x320
 [<ffffffff8146cfa1>] do_take_over_console+0x61/0x70
 [<ffffffff813e2b93>] do_fbcon_takeover+0x63/0xc0
 [<ffffffff813e67a5>] fbcon_event_notify+0x715/0x820
 [<ffffffff81762f9d>] notifier_call_chain+0x5d/0x110
 [<ffffffff8107aadc>] __blocking_notifier_call_chain+0x6c/0xc0
 [<ffffffff8107ab46>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff813d7c0b>] fb_notifier_call_chain+0x1b/0x20
 [<ffffffff813d95b2>] register_framebuffer+0x1e2/0x320
 [<ffffffffa01043e1>] drm_fb_helper_initial_config+0x371/0x540 [drm_kms_helper]
 [<ffffffff8173cbcb>] ? kmemleak_alloc+0x5b/0xc0
 [<ffffffff81198874>] ? kmem_cache_alloc_trace+0x104/0x290
 [<ffffffffa01035e1>] ? drm_fb_helper_single_add_all_connectors+0x81/0xf0 [drm_kms_helper]
 [<ffffffffa01bcb05>] nouveau_fbcon_init+0x105/0x140 [nouveau]
 [<ffffffffa01ad0af>] nouveau_drm_load+0x43f/0x610 [nouveau]
 [<ffffffffa008a79e>] drm_get_pci_dev+0x17e/0x2a0 [drm]
 [<ffffffffa01ad4da>] nouveau_drm_probe+0x25a/0x2a0 [nouveau]
 [<ffffffff8175f162>] ? _raw_spin_unlock_irqrestore+0x42/0x80
 [<ffffffff813b13db>] local_pci_probe+0x4b/0x80
 [<ffffffff813b1701>] pci_device_probe+0x111/0x120
 [<ffffffff814977eb>] driver_probe_device+0x8b/0x3a0
 [<ffffffff81497bab>] __driver_attach+0xab/0xb0
 [<ffffffff81497b00>] ? driver_probe_device+0x3a0/0x3a0
 [<ffffffff814956ad>] bus_for_each_dev+0x5d/0xa0
 [<ffffffff814971fe>] driver_attach+0x1e/0x20
 [<ffffffff81496cc1>] bus_add_driver+0x111/0x290
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffff814982b7>] driver_register+0x77/0x170
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffff813b0454>] __pci_register_driver+0x64/0x70
 [<ffffffffa008a9da>] drm_pci_init+0x11a/0x130 [drm]
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffffa022a04d>] nouveau_drm_init+0x4d/0x1000 [nouveau]
 [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
 [<ffffffff810c54cb>] load_module+0x123b/0x1bf0
 [<ffffffff81399a50>] ? ddebug_proc_open+0xb0/0xb0
 [<ffffffff813855ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff810c5f57>] SyS_init_module+0xd7/0x120
 [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Otherwise we end up dereferencing the already freed net->ipv6.mrt pointer
which leads to a panic (from Srivatsa S. Bhat):

BUG: unable to handle kernel paging request at ffff882018552020
IP: [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
PGD 290a067 PUD 207ffe0067 PMD 207ff1d067 PTE 8000002018552060
Oops: 0000 [rabeeh#1] SMP DEBUG_PAGEALLOC
Modules linked in: ebtable_nat ebtables nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables nfsd lockd nfs_acl exportfs auth_rpcgss autofs4 sunrpc 8021q garp bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter
+ip6_tables ipv6 vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii microcode i2c_i801 i2c_core lpc_ich mfd_core shpchp ioatdma dca mlx4_core be2net wmi acpi_cpufreq mperf ext4 jbd2 mbcache dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 7 Comm: kworker/u33:0 Not tainted 3.11.0-rc1-ea45e-a rabeeh#4
Hardware name: IBM  -[8737R2A]-/00Y2738, BIOS -[B2E120RUS-1.20]- 11/30/2012
Workqueue: netns cleanup_net
task: ffff8810393641c0 ti: ffff881039366000 task.ti: ffff881039366000
RIP: 0010:[<ffffffffa0366b02>]  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
RSP: 0018:ffff881039367bd8  EFLAGS: 00010286
RAX: ffff881039367fd8 RBX: ffff882018552000 RCX: dead000000200200
RDX: 0000000000000000 RSI: ffff881039367b68 RDI: ffff881039367b68
RBP: ffff881039367bf8 R08: ffff881039367b68 R09: 2222222222222222
R10: 2222222222222222 R11: 2222222222222222 R12: ffff882015a7a040
R13: ffff882014eb89c0 R14: ffff8820289e2800 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff882018552020 CR3: 0000000001c0b000 CR4: 00000000000407f0
Stack:
 ffff881039367c18 ffff882014eb89c0 ffff882015e28c00 0000000000000000
 ffff881039367c18 ffffffffa034d9d1 ffff8820289e2800 ffff882014eb89c0
 ffff881039367c58 ffffffff815bdecb ffffffff815bddf2 ffff882014eb89c0
Call Trace:
 [<ffffffffa034d9d1>] rawv6_close+0x21/0x40 [ipv6]
 [<ffffffff815bdecb>] inet_release+0xfb/0x220
 [<ffffffff815bddf2>] ? inet_release+0x22/0x220
 [<ffffffffa032686f>] inet6_release+0x3f/0x50 [ipv6]
 [<ffffffff8151c1d9>] sock_release+0x29/0xa0
 [<ffffffff81525520>] sk_release_kernel+0x30/0x70
 [<ffffffffa034f14b>] icmpv6_sk_exit+0x3b/0x80 [ipv6]
 [<ffffffff8152fff9>] ops_exit_list+0x39/0x60
 [<ffffffff815306fb>] cleanup_net+0xfb/0x1a0
 [<ffffffff81075e3a>] process_one_work+0x1da/0x610
 [<ffffffff81075dc9>] ? process_one_work+0x169/0x610
 [<ffffffff81076390>] worker_thread+0x120/0x3a0
 [<ffffffff81076270>] ? process_one_work+0x610/0x610
 [<ffffffff8107da2e>] kthread+0xee/0x100
 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8162a99c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70
Code: 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 4c 8b 67 30 49 89 fd e8 db 3c 1e e1 49 8b 9c 24 90 08 00 00 48 85 db 74 06 <4c> 39 6b 20 74 20 bb f3 ff ff ff e8 8e 3c 1e e1 89 d8 4c 8b 65
RIP  [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6]
 RSP <ffff881039367bd8>
CR2: ffff882018552020

Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) causes the following undefined instruction error on a mx53 (Cortex-A8):

Internal error: Oops - undefined instruction: 0 [rabeeh#1] SMP ARM
CPU: 0 PID: 275 Comm: modprobe Not tainted 3.11.0-rc2-next-20130722-00009-g9b0f371 #881
task: df46cc00 ti: df48e000 task.ti: df48e000
PC is at check_and_switch_context+0x17c/0x4d0
LR is at check_and_switch_context+0xdc/0x4d0

This problem happens because check_and_switch_context() calls dummy_flush_tlb_a15_erratum() without checking if we are really running on a Cortex-A15 or not.

To avoid this issue, only call dummy_flush_tlb_a15_erratum() inside
check_and_switch_context() if erratum_a15_798181() returns true, which means that we are really running on a Cortex-A15.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When the brcmfmac device is physically removed cfg80211 gives a
warning upon unregistering the net device (see below).

[23052.390197] WARNING: CPU: 0 PID: 30 at net/wireless/core.c:937 cfg80211_netdev_notifier_call+0x164/0x600 [cfg80211]()
[23052.400843] Modules linked in: brcmfmac(O) brcmutil(O) cfg80211(O) pl2303 usbserial binfmt_misc snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event lpc_ich snd_seq snd_timer snd_seq_device snd psmouse mfd_core serio_raw soundcore snd_page_alloc intel_ips dell_laptop dell_wmi sparse_keymap dcdbas nouveau ttm drm_kms_helper drm i2c_algo_bit mxm_wmi ahci libahci sdhci_pci firewire_ohci firewire_core sdhci crc_itu_t mmc_core intel_agp intel_gtt e1000e ptp pps_core agpgart video [last unloaded: brcmfmac]
[23052.452987] CPU: 0 PID: 30 Comm: khubd Tainted: G           O 3.11.0-rc1-wl-testing-lockdep-00002-g41cc093-dirty rabeeh#1
[23052.463480] Hardware name: Dell Inc. Latitude E6410/07XJP9, BIOS A07 02/15/2011
[23052.470852]  00000000 00000000 f4efdc18 c1522e3d f845bed2 f4efdc4 c103fbe4 c16a9254
[23052.478762]  00000000 0000001e f845bed2 000003a9 f841da44 f841da44 f3790004 f25539c0
[23052.486741]  e2700200 f4efdc58 c103fc22 00000009 00000000 f4efdcc0 f841da44 00000002
[23052.494712] Call Trace:
[23052.497165]  [<c1522e3d>] dump_stack+0x4b/0x66
[23052.501685]  [<c103fbe4>] warn_slowpath_common+0x84/0xa0
[23052.507085]  [<f841da44>] ? cfg80211_netdev_notifier_call+0x164/0x600 [cfg80211]
[23052.514542]  [<f841da44>] ? cfg80211_netdev_notifier_call+0x164/0x600 [cfg80211]
[23052.521981]  [<c103fc22>] warn_slowpath_null+0x22/0x30
[23052.527191]  [<f841da44>] cfg80211_netdev_notifier_call+0x164/0x600 [cfg80211]
[23052.534494]  [<c150abe8>] ? packet_notifier+0xc8/0x1d0
[23052.539703]  [<c150abfc>] ? packet_notifier+0xdc/0x1d0
[23052.544880]  [<c150ab20>] ? packet_seq_stop+0x30/0x30
[23052.550002]  [<c152d655>] notifier_call_chain+0x45/0x60
[23052.555298]  [<c106839f>] raw_notifier_call_chain+0x1f/0x30
[23052.560963]  [<c143c693>] call_netdevice_notifiers_info+0x33/0x70
[23052.567153]  [<c1459869>] ? qdisc_destroy+0x99/0xb0
[23052.572116]  [<c143c6e3>] call_netdevice_notifiers+0x13/0x20
[23052.577861]  [<c143df93>] rollback_registered_many+0xf3/0x1d0
[23052.583687]  [<c1524cfc>] ? mutex_lock_nested+0x25c/0x350
[23052.589150]  [<c143e0f4>] rollback_registered+0x24/0x40
[23052.594445]  [<c143e15f>] unregister_netdevice_queue+0x4f/0xb0
[23052.600344]  [<c143e299>] unregister_netdev+0x19/0x30
[23052.605484]  [<f865b38f>] brcmf_del_if+0xbf/0x160 [brcmfmac]
[23052.611223]  [<f865b7ae>] brcmf_detach+0x5e/0xd0 [brcmfmac]
[23052.616881]  [<f8667413>] brcmf_usb_disconnect+0x63/0xa0 [brcmfmac]
[23052.623217]  [<c13e09aa>] usb_unbind_interface+0x4a/0x180

When the device is physically connected the driver sends a disassoc
command to the device and response triggers the driver to inform cfg80211
about it. However, with the device removed the disassoc command fails.
This patch adds a call to cfg80211_disconnected() when that command fails.

The warning was added by commit below and also cleans up, but better
doing it in the driver if only to get rid of the warning.

commit f9bef3d
Author: Ben Greear <greearb@candelatech.com>
Date:   Wed Jun 19 14:06:26 2013 -0700

    wireless: check for dangling wdev->current_bss pointer

Cc: Ben Greear <greearb@candelatech.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
While EEH error happens, we might not have network device instance
(struct net_device) yet. So we can't access the instance safely and
check its link state, which causes kernel crash. The patch fixes it.

EEH: Frozen PE#2 on PHB#3 detected
EEH: This PCI device has failed 1 times in the last hour
EEH: Notify device drivers to shutdown
(NULL net_device): PCI I/O error detected
Unable to handle kernel paging request for data at address 0x00000048
Faulting instruction address: 0xd00000001c9387a8
Oops: Kernel access of bad area, sig: 11 [rabeeh#1]
SMP NR_CPUS=1024 NUMA PowerNV
:
NIP [d00000001c9387a8] .tg3_io_error_detected+0x78/0x2a0 [tg3]
LR [d00000001c9387a4] .tg3_io_error_detected+0x74/0x2a0 [tg3]
Call Trace:
[c000003f93a0f960] [d00000001c9387a4] .tg3_io_error_detected+0x74/0x2a0 [tg3]
[c000003f93a0fa30] [c00000000003844c] .eeh_report_error+0xac/0x120
[c000003f93a0fac0] [c0000000000371bc] .eeh_pe_dev_traverse+0x8c/0x150
[c000003f93a0fb60] [c000000000038858] .eeh_handle_normal_event+0x128/0x3d0
[c000003f93a0fbf0] [c000000000038db8] .eeh_handle_event+0x2b8/0x2c0
[c000003f93a0fc90] [c000000000038e80] .eeh_event_handler+0xc0/0x170
[c000003f93a0fd30] [c0000000000cc000] .kthread+0xf0/0x100
[c000003f93a0fe30] [c00000000000a0dc] .ret_from_kernel_thread+0x5c/0x80

Reported-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Acked-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Requesting external module with cb_lock taken can result in
the deadlock like showed below:

[ 2458.111347] Showing all locks held in the system:
[ 2458.111347] 1 lock held by NetworkManager/582:
[ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40
[ 2458.111347] 1 lock held by modprobe/603:
[ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30

[ 2461.579457] SysRq : Show Blocked State
[ 2461.580103]   task                        PC stack   pid father
[ 2461.580103] NetworkManager  D ffff880034b84500  4040   582      1 0x00000080
[ 2461.580103]  ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8
[ 2461.580103]  ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff
[ 2461.580103]  ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700
[ 2461.580103] Call Trace:
[ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
[ 2461.580103]  [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360
[ 2461.580103]  [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140
[ 2461.580103]  [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50
[ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 2461.580103]  [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170
[ 2461.580103]  [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20
[ 2461.580103]  [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210
[ 2461.580103]  [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170
[ 2461.580103]  [<ffffffff81095cc3>] __request_module+0x1b3/0x370
[ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 2461.580103]  [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190
[ 2461.580103]  [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0
[ 2461.580103]  [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0
[ 2461.580103]  [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0
[ 2461.580103]  [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0
[ 2461.580103]  [<ffffffff8162bc88>] genl_rcv+0x28/0x40
[ 2461.580103]  [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190
[ 2461.580103]  [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750
[ 2461.580103]  [<ffffffff815db849>] sock_sendmsg+0x99/0xd0
[ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
[ 2461.580103]  [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350
[ 2461.580103]  [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0
[ 2461.580103]  [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50
[ 2461.580103]  [<ffffffff810218b9>] ? sched_clock+0x9/0x10
[ 2461.580103]  [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80
[ 2461.580103]  [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100
[ 2461.580103]  [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10
[ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
[ 2461.580103]  [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0
[ 2461.580103]  [<ffffffff8120fec9>] ? fget_light+0xf9/0x510
[ 2461.580103]  [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510
[ 2461.580103]  [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80
[ 2461.580103]  [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20
[ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
[ 2461.580103] modprobe        D ffff88000f2c8000  4632   603    602 0x00000080
[ 2461.580103]  ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8
[ 2461.580103]  ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500
[ 2461.580103]  ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0
[ 2461.580103] Call Trace:
[ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
[ 2461.580103]  [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0
[ 2461.580103]  [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0
[ 2461.580103]  [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20
[ 2461.580103]  [<ffffffff8173492d>] ? down_write+0x9d/0xb2
[ 2461.580103]  [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30
[ 2461.580103]  [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
[ 2461.580103]  [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211]
[ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
[ 2461.580103]  [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211]
[ 2461.580103]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
[ 2461.580103]  [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50
[ 2461.580103]  [<ffffffff810f75af>] load_module+0x1c6f/0x27f0
[ 2461.580103]  [<ffffffff810f2c90>] ? store_uevent+0x40/0x40
[ 2461.580103]  [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0
[ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
[ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 rabeeh#1

Problem start to happen after adding net-pf-16-proto-16-family-nl80211
alias name to cfg80211 module by below commit (though that commit
itself is perfectly fine):

commit fb4e156
Author: Marcel Holtmann <marcel@holtmann.org>
Date:   Sun Apr 28 16:22:06 2013 -0700

    nl80211: Add generic netlink module alias for cfg80211/nl80211

Reported-and-tested-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Reviewed-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
We used to keep the port's char device structs and the /sys entries
around till the last reference to the port was dropped.  This is
actually unnecessary, and resulted in buggy behaviour:

1. Open port in guest
2. Hot-unplug port
3. Hot-plug a port with the same 'name' property as the unplugged one

This resulted in hot-plug being unsuccessful, as a port with the same
name already exists (even though it was unplugged).

This behaviour resulted in a warning message like this one:

-------------------8<---------------------------------------
WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
Hardware name: KVM
sysfs: cannot create duplicate filename
'/devices/pci0000:00/0000:00:04.0/virtio0/virtio-ports/vport0p1'

Call Trace:
 [<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b6f6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff811f2319>] ? sysfs_add_one+0xc9/0x130
 [<ffffffff811f23e8>] ? create_dir+0x68/0xb0
 [<ffffffff811f2469>] ? sysfs_create_dir+0x39/0x50
 [<ffffffff81273129>] ? kobject_add_internal+0xb9/0x260
 [<ffffffff812733d8>] ? kobject_add_varg+0x38/0x60
 [<ffffffff812734b4>] ? kobject_add+0x44/0x70
 [<ffffffff81349de4>] ? get_device_parent+0xf4/0x1d0
 [<ffffffff8134b389>] ? device_add+0xc9/0x650

-------------------8<---------------------------------------

Instead of relying on guest applications to release all references to
the ports, we should go ahead and unregister the port from all the core
layers.  Any open/read calls on the port will then just return errors,
and an unplug/plug operation on the host will succeed as expected.

This also caused buggy behaviour in case of the device removal (not just
a port): when the device was removed (which means all ports on that
device are removed automatically as well), the ports with active
users would clean up only when the last references were dropped -- and
it would be too late then to be referencing char device pointers,
resulting in oopses:

-------------------8<---------------------------------------
PID: 6162   TASK: ffff8801147ad500  CPU: 0   COMMAND: "cat"
 #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b
 rabeeh#1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322
 rabeeh#2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50
 rabeeh#3 [ffff88011b9d5bf0] die at ffffffff8100f26b
 rabeeh#4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2
 rabeeh#5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5
    [exception RIP: strlen+2]
    RIP: ffffffff81272ae2  RSP: ffff88011b9d5d00  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff880118901c18  RCX: 0000000000000000
    RDX: ffff88011799982c  RSI: 00000000000000d0  RDI: 3a303030302f3030
    RBP: ffff88011b9d5d38   R8: 0000000000000006   R9: ffffffffa0134500
    R10: 0000000000001000  R11: 0000000000001000  R12: ffff880117a1cc10
    R13: 00000000000000d0  R14: 0000000000000017  R15: ffffffff81aff700
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 rabeeh#6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d
 linux4kix#7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551
 linux4kix#8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb
 linux4kix#9 [ffff88011b9d5de0] device_del at ffffffff813440c7

-------------------8<---------------------------------------

So clean up when we have all the context, and all that's left to do when
the references to the port have dropped is to free up the port struct
itself.

CC: <stable@vger.kernel.org>
Reported-by: chayang <chayang@redhat.com>
Reported-by: YOGANANTH SUBRAMANIAN <anantyog@in.ibm.com>
Reported-by: FuXiangChun <xfu@redhat.com>
Reported-by: Qunfang Zhang <qzhang@redhat.com>
Reported-by: Sibiao Luo <sluo@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Jarod reported an Oops like when testing with fips=1:

CIFS VFS: could not allocate crypto hmacmd5
CIFS VFS: could not crypto alloc hmacmd5 rc -2
CIFS VFS: Error -2 during NTLMSSP authentication
CIFS VFS: Send error in SessSetup = -2
BUG: unable to handle kernel NULL pointer dereference at 000000000000004e
IP: [<ffffffff812b5c7a>] crypto_destroy_tfm+0x1a/0x90
PGD 0
Oops: 0000 [rabeeh#1] SMP
Modules linked in: md4 nls_utf8 cifs dns_resolver fscache kvm serio_raw virtio_balloon virtio_net mperf i2c_piix4 cirrus drm_kms_helper ttm drm i2c_core virtio_blk ata_generic pata_acpi
CPU: 1 PID: 639 Comm: mount.cifs Not tainted 3.11.0-0.rc3.git0.1.fc20.x86_64 rabeeh#1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff88007bf496e0 ti: ffff88007b080000 task.ti: ffff88007b080000
RIP: 0010:[<ffffffff812b5c7a>]  [<ffffffff812b5c7a>] crypto_destroy_tfm+0x1a/0x90
RSP: 0018:ffff88007b081d10  EFLAGS: 00010282
RAX: 0000000000001f1f RBX: ffff880037422000 RCX: ffff88007b081fd8
RDX: 000000000000001f RSI: 0000000000000006 RDI: fffffffffffffffe
RBP: ffff88007b081d30 R08: ffff880037422000 R09: ffff88007c090100
R10: 0000000000000000 R11: 00000000fffffffe R12: fffffffffffffffe
R13: ffff880037422000 R14: ffff880037422000 R15: 00000000fffffffe
FS:  00007fc322f4f780(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000004e CR3: 000000007bdaa000 CR4: 00000000000006e0
Stack:
 ffffffff81085845 ffff880037422000 ffff8800375e7400 ffff880037422000
 ffff88007b081d48 ffffffffa0176022 ffff880037422000 ffff88007b081d60
 ffffffffa015c07b ffff880037600600 ffff88007b081dc8 ffffffffa01610e1
Call Trace:
 [<ffffffff81085845>] ? __cancel_work_timer+0x75/0xf0
 [<ffffffffa0176022>] cifs_crypto_shash_release+0x82/0xf0 [cifs]
 [<ffffffffa015c07b>] cifs_put_tcp_session+0x8b/0xe0 [cifs]
 [<ffffffffa01610e1>] cifs_mount+0x9d1/0xad0 [cifs]
 [<ffffffffa014ff50>] cifs_do_mount+0xa0/0x4d0 [cifs]
 [<ffffffff811ab6e9>] mount_fs+0x39/0x1b0
 [<ffffffff811c466f>] vfs_kern_mount+0x5f/0xf0
 [<ffffffff811c6a9e>] do_mount+0x23e/0xa20
 [<ffffffff811c66e6>] ? copy_mount_options+0x36/0x170
 [<ffffffff811c7303>] SyS_mount+0x83/0xc0
 [<ffffffff8165c8d9>] system_call_fastpath+0x16/0x1b
Code: eb 9e 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 48 85 ff 74 46 <48> 83 7e 48 00 48 8b 5e 50 74 4b 48 89 f7 e8 83 fc ff ff 4c 8b
RIP  [<ffffffff812b5c7a>] crypto_destroy_tfm+0x1a/0x90
 RSP <ffff88007b081d10>
CR2: 000000000000004e

The cifs code allocates some crypto structures. If that fails, it
returns an error, but it leaves the pointers set to their PTR_ERR
values. Then later when it tries to clean up, it sees that those values
are non-NULL and then passes them to the routine that frees them.

Fix this by setting the pointers to NULL after collecting the error code
in this situation.

Cc: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
vscsi->num_queues counts the number of request virtqueue which does not
include the control and event virtqueue. It is wrong to subtract
VIRTIO_SCSI_VQ_BASE from vscsi->num_queues.

This patch fixes the following panic.

(qemu) device_del scsi0

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
 IP: [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120
 PGD 0
 Oops: 0000 [rabeeh#1] SMP
 Modules linked in:
 CPU: 0 PID: 659 Comm: kworker/0:1 Not tainted 3.11.0-rc2+ #1172
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 Workqueue: kacpi_hotplug _handle_hotplug_event_func
 task: ffff88007bee1cc0 ti: ffff88007bfe4000 task.ti: ffff88007bfe4000
 RIP: 0010:[<ffffffff8179b29f>]  [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120
 RSP: 0018:ffff88007bfe5a38  EFLAGS: 00010202
 RAX: 0000000000000010 RBX: ffff880077fd0d28 RCX: 0000000000000050
 RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
 RBP: ffff88007bfe5a58 R08: ffff880077f6ff00 R09: 0000000000000001
 R10: ffffffff8143e673 R11: 0000000000000001 R12: 0000000000000001
 R13: ffff880077fd0800 R14: 0000000000000000 R15: ffff88007bf489b0
 FS:  0000000000000000(0000) GS:ffff88007ea00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000020 CR3: 0000000079f8b000 CR4: 00000000000006f0
 Stack:
  ffff880077fd0d28 0000000000000000 ffff880077fd0800 0000000000000008
  ffff88007bfe5a78 ffffffff8179b37d ffff88007bccc800 ffff88007bccc800
  ffff88007bfe5a98 ffffffff8179b3b6 ffff88007bccc800 ffff880077fd0d28
 Call Trace:
  [<ffffffff8179b37d>] virtscsi_set_affinity+0x2d/0x40
  [<ffffffff8179b3b6>] virtscsi_remove_vqs+0x26/0x50
  [<ffffffff8179c7d2>] virtscsi_remove+0x82/0xa0
  [<ffffffff814cb6b2>] virtio_dev_remove+0x22/0x70
  [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0
  [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40
  [<ffffffff8167bb96>] bus_remove_device+0x116/0x150
  [<ffffffff81679936>] device_del+0x126/0x1e0
  [<ffffffff81679a06>] device_unregister+0x16/0x30
  [<ffffffff814cb889>] unregister_virtio_device+0x19/0x30
  [<ffffffff814cdad6>] virtio_pci_remove+0x36/0x80
  [<ffffffff81464ae7>] pci_device_remove+0x37/0x70
  [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0
  [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40
  [<ffffffff8167bb96>] bus_remove_device+0x116/0x150
  [<ffffffff81679936>] device_del+0x126/0x1e0
  [<ffffffff8145edfc>] pci_stop_bus_device+0x9c/0xb0
  [<ffffffff8145f036>] pci_stop_and_remove_bus_device+0x16/0x30
  [<ffffffff81474a9e>] acpiphp_disable_slot+0x8e/0x150
  [<ffffffff81474f6a>] hotplug_event_func+0xba/0x1a0
  [<ffffffff814906c8>] ? acpi_os_release_object+0xe/0x12
  [<ffffffff81475911>] _handle_hotplug_event_func+0x31/0x70
  [<ffffffff810b5333>] process_one_work+0x183/0x500
  [<ffffffff810b66e2>] worker_thread+0x122/0x400
  [<ffffffff810b65c0>] ? manage_workers+0x2d0/0x2d0
  [<ffffffff810bc5de>] kthread+0xce/0xe0
  [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff81ca045c>] ret_from_fork+0x7c/0xb0
  [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70
 Code: 01 00 00 00 74 59 45 31 e4 83 bb c8 01 00 00 02 74 46 66 2e 0f 1f 84 00 00 00 00 00 49 63 c4 48 c1 e0 04 48 8b bc 0
3 10 02 00 00 <48> 8b 47 20 48 8b 80 d0 01 00 00 48 8b 40 50 48 85 c0 74 07 be
 RIP  [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120
  RSP <ffff88007bfe5a38>
 CR2: 0000000000000020
 ---[ end trace 99679331a3775f48 ]---

CC: stable@vger.kernel.org
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When a probe is being removed, it cleans up the event files that correspond
to the probe. But there is a race between writing to one of these files
and deleting the probe. This is especially true for the "enable" file.

	CPU 0				CPU 1
	-----				-----

				  fd = open("enable",O_WRONLY);

  probes_open()
  release_all_trace_probes()
  unregister_trace_probe()
  if (trace_probe_is_enabled(tp))
	return -EBUSY

				   write(fd, "1", 1)
				   __ftrace_set_clr_event()
				   call->class->reg()
				    (kprobe_register)
				     enable_trace_probe(tp)

  __unregister_trace_probe(tp);
  list_del(&tp->list)
  unregister_probe_event(tp) <-- fails!
  free_trace_probe(tp)

				   write(fd, "0", 1)
				   __ftrace_set_clr_event()
				   call->class->unreg
				    (kprobe_register)
				    disable_trace_probe(tp) <-- BOOM!

A test program was written that used two threads to simulate the
above scenario adding a nanosleep() interval to change the timings
and after several thousand runs, it was able to trigger this bug
and crash:

BUG: unable to handle kernel paging request at 00000005000000f9
IP: [<ffffffff810dee70>] probes_open+0x3b/0xa7
PGD 7808a067 PUD 0
Oops: 0000 [rabeeh#1] PREEMPT SMP
Dumping ftrace buffer:
---------------------------------
Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6
CPU: 1 PID: 2070 Comm: test-kprobe-rem Not tainted 3.11.0-rc3-test+ #47
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
task: ffff880077756440 ti: ffff880076e52000 task.ti: ffff880076e52000
RIP: 0010:[<ffffffff810dee70>]  [<ffffffff810dee70>] probes_open+0x3b/0xa7
RSP: 0018:ffff880076e53c38  EFLAGS: 00010203
RAX: 0000000500000001 RBX: ffff88007844f440 RCX: 0000000000000003
RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880076e52000
RBP: ffff880076e53c58 R08: ffff880076e53bd8 R09: 0000000000000000
R10: ffff880077756440 R11: 0000000000000006 R12: ffffffff810dee35
R13: ffff880079250418 R14: 0000000000000000 R15: ffff88007844f450
FS:  00007f87a276f700(0000) GS:ffff88007d480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000005000000f9 CR3: 0000000077262000 CR4: 00000000000007e0
Stack:
 ffff880076e53c58 ffffffff81219ea0 ffff88007844f440 ffffffff810dee35
 ffff880076e53ca8 ffffffff81130f78 ffff8800772986c0 ffff8800796f93a0
 ffffffff81d1b5d8 ffff880076e53e04 0000000000000000 ffff88007844f440
Call Trace:
 [<ffffffff81219ea0>] ? security_file_open+0x2c/0x30
 [<ffffffff810dee35>] ? unregister_trace_probe+0x4b/0x4b
 [<ffffffff81130f78>] do_dentry_open+0x162/0x226
 [<ffffffff81131186>] finish_open+0x46/0x54
 [<ffffffff8113f30b>] do_last+0x7f6/0x996
 [<ffffffff8113cc6f>] ? inode_permission+0x42/0x44
 [<ffffffff8113f6dd>] path_openat+0x232/0x496
 [<ffffffff8113fc30>] do_filp_open+0x3a/0x8a
 [<ffffffff8114ab32>] ? __alloc_fd+0x168/0x17a
 [<ffffffff81131f4e>] do_sys_open+0x70/0x102
 [<ffffffff8108f06e>] ? trace_hardirqs_on_caller+0x160/0x197
 [<ffffffff81131ffe>] SyS_open+0x1e/0x20
 [<ffffffff81522742>] system_call_fastpath+0x16/0x1b
Code: e5 41 54 53 48 89 f3 48 83 ec 10 48 23 56 78 48 39 c2 75 6c 31 f6 48 c7
RIP  [<ffffffff810dee70>] probes_open+0x3b/0xa7
 RSP <ffff880076e53c38>
CR2: 00000005000000f9
---[ end trace 35f17d68fc569897 ]---

The unregister_trace_probe() must be done first, and if it fails it must
fail the removal of the kprobe.

Several changes have already been made by Oleg Nesterov and Masami Hiramatsu
to allow moving the unregister_probe_event() before the removal of
the probe and exit the function if it fails. This prevents the tp
structure from being used after it is freed.

Link: http://lkml.kernel.org/r/20130704034038.819592356@goodmis.org

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When creation of TIPC internal server socket fails,
we get an oops with the following dump:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc]
PGD 13719067 PUD 12008067 PMD 0
Oops: 0000 [rabeeh#1] SMP DEBUG_PAGEALLOC
Modules linked in: tipc(+)
CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ rabeeh#1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000
RIP: 0010:[<ffffffffa0011f49>]  [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc]
RSP: 0018:ffff88001374dc98  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000
RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000
RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0
R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8
FS:  0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0
Stack:
ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001
ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea
ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150
Call Trace:
[<ffffffffa0012922>] tipc_server_stop+0xa2/0x1b0 [tipc]
[<ffffffffa0009995>] tipc_subscr_stop+0x15/0x20 [tipc]
[<ffffffffa00130f5>] tipc_core_stop+0x1d/0x33 [tipc]
[<ffffffffa001f0d4>] tipc_init+0xd4/0xf8 [tipc]
[<ffffffffa001f000>] ? 0xffffffffa001efff
[<ffffffff8100023f>] do_one_initcall+0x3f/0x150
[<ffffffff81082f4d>] ? __blocking_notifier_call_chain+0x7d/0xd0
[<ffffffff810cc58a>] load_module+0x11aa/0x19c0
[<ffffffff810c8d60>] ? show_initstate+0x50/0x50
[<ffffffff8190311c>] ? retint_restore_args+0xe/0xe
[<ffffffff810cce79>] SyS_init_module+0xd9/0x110
[<ffffffff8190dc65>] sysenter_dispatch+0x7/0x1f
Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24
b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 <4c> 8b 68 20 4d 8d a5 48 03 00
00 4c 89 e7 e8 04 05 8f e1 4c 89
RIP  [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc]
RSP <ffff88001374dc98>
CR2: 0000000000000020
---[ end trace b02321f40e4269a3 ]---

We have the following call chain:

tipc_core_start()
    ret = tipc_subscr_start()
        ret = tipc_server_start(){
                  server->enabled = 1;
                  ret = tipc_open_listening_sock()
              }

I.e., the server->enabled flag is unconditionally set to 1, whatever
the return value of tipc_open_listening_sock().

This causes a crash when tipc_core_start() tries to clean up
resources after a failed initialization:

    if (ret == failed)
        tipc_subscr_stop()
            tipc_server_stop(){
                if (server->enabled)
                    tipc_close_conn(){
                        NULL reference of con->sock-sk
                        OOPS!
                }
            }

To avoid this, tipc_server_start() should only set server->enabled
to 1 in case of a succesful socket creation. In case of failure, it
should release all allocated resources before returning.

Problem introduced in commit c5fa7b3
("tipc: introduce new TIPC server infrastructure") in v3.11-rc1.
Note that it won't be seen often; it takes a module load under memory
constrained conditions in order to trigger the failure condition.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
1)Use kvmap_itlb_longpath instead of kvmap_dtlb_longpath.

2)Handle page #0 only, don't handle page rabeeh#1: bleu -> blu

 (KERNBASE is 0x400000, so rabeeh#1 does not exist too. But everything
  is possible in the future. Fix to not to have problems later.)

3)Remove unused kvmap_itlb_nonlinear.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ike page)

Unfortunately, I never committed the fix to a nasty oops which can
occur as a result of that commit:

------------[ cut here ]------------
kernel BUG at /home/olof/work/batch/include/linux/mm.h:414!
Internal error: Oops - BUG: 0 [rabeeh#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 490 Comm: killall5 Not tainted 3.11.0-rc3-00288-gabe0308 #53
task: e90acac0 ti: e9be8000 task.ti: e9be8000
PC is at special_mapping_fault+0xa4/0xc4
LR is at __do_fault+0x68/0x48c

This doesn't show up unless you do quite a bit of testing; a simple
boot test does not do this, so all my nightly tests were passing fine.

The reason for this is that install_special_mapping() expects the
page array to stick around, and as this was only inserting one page
which was stored on the kernel stack, that's why this was blowing up.

Reported-by: Olof Johansson <olof@lixom.net>
Tested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Since commit ac4e4af ("macvtap: Consistently use rcu functions"),
Thomas gets two different warnings :

BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45891/45892
caller is macvtap_do_read+0x45c/0x600 [macvtap]
CPU: 1 PID: 45892 Comm: vhost-45891 Not tainted 3.11.0-bisecttest linux4kix#13
Call Trace:
([<00000000001126ee>] show_trace+0x126/0x144)
 [<00000000001127d2>] show_stack+0xc6/0xd4
 [<000000000068bcec>] dump_stack+0x74/0xd8
 [<0000000000481066>] debug_smp_processor_id+0xf6/0x114
 [<000003ff802e9a18>] macvtap_do_read+0x45c/0x600 [macvtap]
 [<000003ff802e9c1c>] macvtap_recvmsg+0x60/0x88 [macvtap]
 [<000003ff80318c5e>] handle_rx+0x5b2/0x800 [vhost_net]
 [<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
 [<000000000015f3ac>] kthread+0xd8/0xe4
 [<00000000006934a6>] kernel_thread_starter+0x6/0xc
 [<00000000006934a0>] kernel_thread_starter+0x0/0xc

And

BUG: using smp_processor_id() in preemptible [00000000] code: vhost-45897/45898
caller is macvlan_start_xmit+0x10a/0x1b4 [macvlan]
CPU: 1 PID: 45898 Comm: vhost-45897 Not tainted 3.11.0-bisecttest linux4kix#16
Call Trace:
([<00000000001126ee>] show_trace+0x126/0x144)
 [<00000000001127d2>] show_stack+0xc6/0xd4
 [<000000000068bdb8>] dump_stack+0x74/0xd4
 [<0000000000481132>] debug_smp_processor_id+0xf6/0x114
 [<000003ff802b72ca>] macvlan_start_xmit+0x10a/0x1b4 [macvlan]
 [<000003ff802ea69a>] macvtap_get_user+0x982/0xbc4 [macvtap]
 [<000003ff802ea92a>] macvtap_sendmsg+0x4e/0x60 [macvtap]
 [<000003ff8031947c>] handle_tx+0x494/0x5ec [vhost_net]
 [<000003ff8028f77c>] vhost_worker+0x15c/0x1c4 [vhost]
 [<000000000015f3ac>] kthread+0xd8/0xe4
 [<000000000069356e>] kernel_thread_starter+0x6/0xc
 [<0000000000693568>] kernel_thread_starter+0x0/0xc
2 locks held by vhost-45897/45898:
 #0:  (&vq->mutex){+.+.+.}, at: [<000003ff8031903c>] handle_tx+0x54/0x5ec [vhost_net]
 rabeeh#1:  (rcu_read_lock){.+.+..}, at: [<000003ff802ea53c>] macvtap_get_user+0x824/0xbc4 [macvtap]

In the first case, macvtap_put_user() calls macvlan_count_rx()
in a preempt-able context, and this is not allowed.

In the second case, macvtap_get_user() calls
macvlan_start_xmit() with BH enabled, and this is not allowed.

Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Bisected-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
We met lockdep warning when enable and disable the bearer for commands such as:

tipc-config -netid=1234 -addr=1.1.3 -be=eth:eth0
tipc-config -netid=1234 -addr=1.1.3 -bd=eth:eth0

---------------------------------------------------

[  327.693595] ======================================================
[  327.693994] [ INFO: possible circular locking dependency detected ]
[  327.694519] 3.11.0-rc3-wwd-default rabeeh#4 Tainted: G           O
[  327.694882] -------------------------------------------------------
[  327.695385] tipc-config/5825 is trying to acquire lock:
[  327.695754]  (((timer))rabeeh#2){+.-...}, at: [<ffffffff8105be80>] del_timer_sync+0x0/0xd0
[  327.696018]
[  327.696018] but task is already holding lock:
[  327.696018]  (&(&b_ptr->lock)->rlock){+.-...}, at: [<ffffffffa02be58d>] bearer_disable+  0xdd/0x120 [tipc]
[  327.696018]
[  327.696018] which lock already depends on the new lock.
[  327.696018]
[  327.696018]
[  327.696018] the existing dependency chain (in reverse order) is:
[  327.696018]
[  327.696018] -> rabeeh#1 (&(&b_ptr->lock)->rlock){+.-...}:
[  327.696018]        [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
[  327.696018]        [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
[  327.696018]        [<ffffffff810b4453>] lock_acquire+0x103/0x130
[  327.696018]        [<ffffffff814d65b1>] _raw_spin_lock_bh+0x41/0x80
[  327.696018]        [<ffffffffa02c5d48>] disc_timeout+0x18/0xd0 [tipc]
[  327.696018]        [<ffffffff8105b92a>] call_timer_fn+0xda/0x1e0
[  327.696018]        [<ffffffff8105bcd7>] run_timer_softirq+0x2a7/0x2d0
[  327.696018]        [<ffffffff8105379a>] __do_softirq+0x16a/0x2e0
[  327.696018]        [<ffffffff81053a35>] irq_exit+0xd5/0xe0
[  327.696018]        [<ffffffff81033005>] smp_apic_timer_interrupt+0x45/0x60
[  327.696018]        [<ffffffff814df4af>] apic_timer_interrupt+0x6f/0x80
[  327.696018]        [<ffffffff8100b70e>] arch_cpu_idle+0x1e/0x30
[  327.696018]        [<ffffffff810a039d>] cpu_idle_loop+0x1fd/0x280
[  327.696018]        [<ffffffff810a043e>] cpu_startup_entry+0x1e/0x20
[  327.696018]        [<ffffffff81031589>] start_secondary+0x89/0x90
[  327.696018]
[  327.696018] -> #0 (((timer))rabeeh#2){+.-...}:
[  327.696018]        [<ffffffff810b33fe>] check_prev_add+0x43e/0x4b0
[  327.696018]        [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
[  327.696018]        [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
[  327.696018]        [<ffffffff810b4453>] lock_acquire+0x103/0x130
[  327.696018]        [<ffffffff8105bebd>] del_timer_sync+0x3d/0xd0
[  327.696018]        [<ffffffffa02c5855>] tipc_disc_delete+0x15/0x30 [tipc]
[  327.696018]        [<ffffffffa02be59f>] bearer_disable+0xef/0x120 [tipc]
[  327.696018]        [<ffffffffa02be74f>] tipc_disable_bearer+0x2f/0x60 [tipc]
[  327.696018]        [<ffffffffa02bfb32>] tipc_cfg_do_cmd+0x2e2/0x550 [tipc]
[  327.696018]        [<ffffffffa02c8c79>] handle_cmd+0x49/0xe0 [tipc]
[  327.696018]        [<ffffffff8143e898>] genl_family_rcv_msg+0x268/0x340
[  327.696018]        [<ffffffff8143ed30>] genl_rcv_msg+0x70/0xd0
[  327.696018]        [<ffffffff8143d4c9>] netlink_rcv_skb+0x89/0xb0
[  327.696018]        [<ffffffff8143e617>] genl_rcv+0x27/0x40
[  327.696018]        [<ffffffff8143d21e>] netlink_unicast+0x15e/0x1b0
[  327.696018]        [<ffffffff8143ddcf>] netlink_sendmsg+0x22f/0x400
[  327.696018]        [<ffffffff813f7836>] __sock_sendmsg+0x66/0x80
[  327.696018]        [<ffffffff813f7957>] sock_aio_write+0x107/0x120
[  327.696018]        [<ffffffff8117f76d>] do_sync_write+0x7d/0xc0
[  327.696018]        [<ffffffff8117fc56>] vfs_write+0x186/0x190
[  327.696018]        [<ffffffff811803e0>] SyS_write+0x60/0xb0
[  327.696018]        [<ffffffff814de852>] system_call_fastpath+0x16/0x1b
[  327.696018]
[  327.696018] other info that might help us debug this:
[  327.696018]
[  327.696018]  Possible unsafe locking scenario:
[  327.696018]
[  327.696018]        CPU0                    CPU1
[  327.696018]        ----                    ----
[  327.696018]   lock(&(&b_ptr->lock)->rlock);
[  327.696018]                                lock(((timer))rabeeh#2);
[  327.696018]                                lock(&(&b_ptr->lock)->rlock);
[  327.696018]   lock(((timer))rabeeh#2);
[  327.696018]
[  327.696018]  *** DEADLOCK ***
[  327.696018]
[  327.696018] 5 locks held by tipc-config/5825:
[  327.696018]  #0:  (cb_lock){++++++}, at: [<ffffffff8143e608>] genl_rcv+0x18/0x40
[  327.696018]  rabeeh#1:  (genl_mutex){+.+.+.}, at: [<ffffffff8143ed66>] genl_rcv_msg+0xa6/0xd0
[  327.696018]  rabeeh#2:  (config_mutex){+.+.+.}, at: [<ffffffffa02bf889>] tipc_cfg_do_cmd+0x39/ 0x550 [tipc]
[  327.696018]  rabeeh#3:  (tipc_net_lock){++.-..}, at: [<ffffffffa02be738>] tipc_disable_bearer+ 0x18/0x60 [tipc]
[  327.696018]  rabeeh#4:  (&(&b_ptr->lock)->rlock){+.-...}, at: [<ffffffffa02be58d>]             bearer_disable+0xdd/0x120 [tipc]
[  327.696018]
[  327.696018] stack backtrace:
[  327.696018] CPU: 2 PID: 5825 Comm: tipc-config Tainted: G           O 3.11.0-rc3-wwd-    default rabeeh#4
[  327.696018] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  327.696018]  00000000ffffffff ffff880037fa77a8 ffffffff814d03dd 0000000000000000
[  327.696018]  ffff880037fa7808 ffff880037fa77e8 ffffffff810b1c4f 0000000037fa77e8
[  327.696018]  ffff880037fa7808 ffff880037e4db40 0000000000000000 ffff880037e4e318
[  327.696018] Call Trace:
[  327.696018]  [<ffffffff814d03dd>] dump_stack+0x4d/0xa0
[  327.696018]  [<ffffffff810b1c4f>] print_circular_bug+0x10f/0x120
[  327.696018]  [<ffffffff810b33fe>] check_prev_add+0x43e/0x4b0
[  327.696018]  [<ffffffff810b3b4d>] validate_chain+0x6dd/0x870
[  327.696018]  [<ffffffff81087a28>] ? sched_clock_cpu+0xd8/0x110
[  327.696018]  [<ffffffff810b40bb>] __lock_acquire+0x3db/0x670
[  327.696018]  [<ffffffff810b4453>] lock_acquire+0x103/0x130
[  327.696018]  [<ffffffff8105be80>] ? try_to_del_timer_sync+0x70/0x70
[  327.696018]  [<ffffffff8105bebd>] del_timer_sync+0x3d/0xd0
[  327.696018]  [<ffffffff8105be80>] ? try_to_del_timer_sync+0x70/0x70
[  327.696018]  [<ffffffffa02c5855>] tipc_disc_delete+0x15/0x30 [tipc]
[  327.696018]  [<ffffffffa02be59f>] bearer_disable+0xef/0x120 [tipc]
[  327.696018]  [<ffffffffa02be74f>] tipc_disable_bearer+0x2f/0x60 [tipc]
[  327.696018]  [<ffffffffa02bfb32>] tipc_cfg_do_cmd+0x2e2/0x550 [tipc]
[  327.696018]  [<ffffffff81218783>] ? security_capable+0x13/0x20
[  327.696018]  [<ffffffffa02c8c79>] handle_cmd+0x49/0xe0 [tipc]
[  327.696018]  [<ffffffff8143e898>] genl_family_rcv_msg+0x268/0x340
[  327.696018]  [<ffffffff8143ed30>] genl_rcv_msg+0x70/0xd0
[  327.696018]  [<ffffffff8143ecc0>] ? genl_lock+0x20/0x20
[  327.696018]  [<ffffffff8143d4c9>] netlink_rcv_skb+0x89/0xb0
[  327.696018]  [<ffffffff8143e608>] ? genl_rcv+0x18/0x40
[  327.696018]  [<ffffffff8143e617>] genl_rcv+0x27/0x40
[  327.696018]  [<ffffffff8143d21e>] netlink_unicast+0x15e/0x1b0
[  327.696018]  [<ffffffff81289d7c>] ? memcpy_fromiovec+0x6c/0x90
[  327.696018]  [<ffffffff8143ddcf>] netlink_sendmsg+0x22f/0x400
[  327.696018]  [<ffffffff813f7836>] __sock_sendmsg+0x66/0x80
[  327.696018]  [<ffffffff813f7957>] sock_aio_write+0x107/0x120
[  327.696018]  [<ffffffff813fe29c>] ? release_sock+0x8c/0xa0
[  327.696018]  [<ffffffff8117f76d>] do_sync_write+0x7d/0xc0
[  327.696018]  [<ffffffff8117fa24>] ? rw_verify_area+0x54/0x100
[  327.696018]  [<ffffffff8117fc56>] vfs_write+0x186/0x190
[  327.696018]  [<ffffffff811803e0>] SyS_write+0x60/0xb0
[  327.696018]  [<ffffffff814de852>] system_call_fastpath+0x16/0x1b

-----------------------------------------------------------------------

The problem is that the tipc_link_delete() will cancel the timer disc_timeout() when
the b_ptr->lock is hold, but the disc_timeout() still call b_ptr->lock to finish the
work, so the dead lock occurs.

We should unlock the b_ptr->lock when del the disc_timeout().

Remove link_timeout() still met the same problem, the patch:

http://article.gmane.org/gmane.network.tipc.general/4380

fix the problem, so no need to send patch for fix link_timeout() deadlock warming.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Lockdep reports a circular lock dependency between
atomic_read_lock and termios_rwsem [1]. However, a lock
order deadlock is not possible since CPU1 only holds a
read lock which cannot prevent CPU0 from also acquiring
a read lock on the same r/w semaphore.

Unfortunately, lockdep cannot currently distinguish whether
the locks are read or write for any particular lock graph,
merely that the locks _were_ previously read and/or write.

Until lockdep is fixed, re-order atomic_read_lock so
termios_rwsem can be dropped and reacquired without
triggering lockdep.

Patch based on original posted here https://lkml.org/lkml/2013/8/1/510
by Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

[1] Initial lockdep report from Artem Savkov <artem.savkov@gmail.com>

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.11.0-rc3-next-20130730+ #140 Tainted: G        W
 -------------------------------------------------------
 bash/1198 is trying to acquire lock:
  (&tty->termios_rwsem){++++..}, at: [<ffffffff816aa3bb>] n_tty_read+0x49b/0x660

 but task is already holding lock:
  (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff816aa0f0>] n_tty_read+0x1d0/0x660

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> rabeeh#1 (&ldata->atomic_read_lock){+.+...}:
        [<ffffffff811111cc>] validate_chain+0x73c/0x850
        [<ffffffff811117e0>] __lock_acquire+0x500/0x5d0
        [<ffffffff81111a29>] lock_acquire+0x179/0x1d0
        [<ffffffff81d34b9c>] mutex_lock_interruptible_nested+0x7c/0x540
        [<ffffffff816aa0f0>] n_tty_read+0x1d0/0x660
        [<ffffffff816a3bb6>] tty_read+0x86/0xf0
        [<ffffffff811f21d3>] vfs_read+0xc3/0x130
        [<ffffffff811f2702>] SyS_read+0x62/0xa0
        [<ffffffff81d45259>] system_call_fastpath+0x16/0x1b

 -> #0 (&tty->termios_rwsem){++++..}:
        [<ffffffff8111064f>] check_prev_add+0x14f/0x590
        [<ffffffff811111cc>] validate_chain+0x73c/0x850
        [<ffffffff811117e0>] __lock_acquire+0x500/0x5d0
        [<ffffffff81111a29>] lock_acquire+0x179/0x1d0
        [<ffffffff81d372c1>] down_read+0x51/0xa0
        [<ffffffff816aa3bb>] n_tty_read+0x49b/0x660
        [<ffffffff816a3bb6>] tty_read+0x86/0xf0
        [<ffffffff811f21d3>] vfs_read+0xc3/0x130
        [<ffffffff811f2702>] SyS_read+0x62/0xa0
        [<ffffffff81d45259>] system_call_fastpath+0x16/0x1b

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&ldata->atomic_read_lock);
                                lock(&tty->termios_rwsem);
                                lock(&ldata->atomic_read_lock);
   lock(&tty->termios_rwsem);

  *** DEADLOCK ***

 2 locks held by bash/1198:
  #0:  (&tty->ldisc_sem){.+.+.+}, at: [<ffffffff816ade04>] tty_ldisc_ref_wait+0x24/0x60
  rabeeh#1:  (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff816aa0f0>] n_tty_read+0x1d0/0x660

 stack backtrace:
 CPU: 1 PID: 1198 Comm: bash Tainted: G        W    3.11.0-rc3-next-20130730+ #140
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
  0000000000000000 ffff880019acdb28 ffffffff81d34074 0000000000000002
  0000000000000000 ffff880019acdb78 ffffffff8110ed75 ffff880019acdb98
  ffff880019fd0000 ffff880019acdb78 ffff880019fd0638 ffff880019fd0670
 Call Trace:
  [<ffffffff81d34074>] dump_stack+0x59/0x7d
  [<ffffffff8110ed75>] print_circular_bug+0x105/0x120
  [<ffffffff8111064f>] check_prev_add+0x14f/0x590
  [<ffffffff81d3ab5f>] ? _raw_spin_unlock_irq+0x4f/0x70
  [<ffffffff811111cc>] validate_chain+0x73c/0x850
  [<ffffffff8110ae0f>] ? trace_hardirqs_off_caller+0x1f/0x190
  [<ffffffff811117e0>] __lock_acquire+0x500/0x5d0
  [<ffffffff81111a29>] lock_acquire+0x179/0x1d0
  [<ffffffff816aa3bb>] ? n_tty_read+0x49b/0x660
  [<ffffffff81d372c1>] down_read+0x51/0xa0
  [<ffffffff816aa3bb>] ? n_tty_read+0x49b/0x660
  [<ffffffff816aa3bb>] n_tty_read+0x49b/0x660
  [<ffffffff810e4130>] ? try_to_wake_up+0x210/0x210
  [<ffffffff816a3bb6>] tty_read+0x86/0xf0
  [<ffffffff811f21d3>] vfs_read+0xc3/0x130
  [<ffffffff811f2702>] SyS_read+0x62/0xa0
  [<ffffffff815e24ee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff81d45259>] system_call_fastpath+0x16/0x1b

Reported-by: Artem Savkov <artem.savkov@gmail.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ent()

Vince Weaver reports an oops in the ARM perf event code while
running his perf_fuzzer tool on a pandaboard running v3.11-rc4.

Unable to handle kernel paging request at virtual address 73fd14cc
pgd = eca6c000
[73fd14cc] *pgd=00000000
Internal error: Oops: 5 [rabeeh#1] SMP ARM
Modules linked in: snd_soc_omap_hdmi omapdss snd_soc_omap_abe_twl6040 snd_soc_twl6040 snd_soc_omap snd_soc_omap_hdmi_card snd_soc_omap_mcpdm snd_soc_omap_mcbsp snd_soc_core snd_compress regmap_spi snd_pcm snd_page_alloc snd_timer snd soundcore
CPU: 1 PID: 2790 Comm: perf_fuzzer Not tainted 3.11.0-rc4 rabeeh#6
task: eddcab80 ti: ed892000 task.ti: ed892000
PC is at armpmu_map_event+0x20/0x88
LR is at armpmu_event_init+0x38/0x280
pc : [<c001c3e4>]    lr : [<c001c17c>]    psr: 60000013
sp : ed893e40  ip : ecececec  fp : edfaec00
r10: 00000000  r9 : 00000000  r8 : ed8c3ac0
r7 : ed8c3b5c  r6 : edfaec00  r5 : 00000000  r4 : 00000000
r3 : 000000ff  r2 : c0496144  r1 : c049611c  r0 : edfaec00
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: aca6c04a  DAC: 00000015
Process perf_fuzzer (pid: 2790, stack limit = 0xed892240)
Stack: (0xed893e40 to 0xed894000)
3e40: 00000800 c001c17c 00000002 c008a748 00000001 00000000 00000000 c00bf078
3e60: 00000000 edfaee50 00000000 00000000 00000000 edfaec00 ed8c3ac0 edfaec00
3e80: 00000000 c073ffac ed893f20 c00bf180 00000001 00000000 c00bf078 ed893f20
3ea0: 00000000 ed8c3ac0 00000000 00000000 00000000 c0cb0818 eddcab80 c00bf440
3ec0: ed893f20 00000000 eddcab80 eca76800 00000000 eca76800 00000000 00000000
3ee0: 00000000 ec984c80 eddcab80 c00bfe68 00000000 00000000 00000000 00000080
3f00: 00000000 ed892000 00000000 ed892030 00000004 ecc7e3c8 ecc7e3c8 00000000
3f20: 00000000 00000048 ecececec 00000000 00000000 00000000 00000000 00000000
3f40: 00000000 00000000 00297810 00000000 00000000 00000000 00000000 00000000
3f60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3f80: 00000002 00000002 000103a4 00000002 0000016c c00128e8 ed892000 00000000
3fa0: 00090998 c0012700 00000002 000103a4 00090ab8 00000000 00000000 0000000f
3fc0: 00000002 000103a4 00000002 0000016c 00090ab0 00090ab8 000107a0 00090998
3fe0: bed92be0 bed92bd0 0000b785 b6e8f6d0 40000010 00090ab8 00000000 00000000
[<c001c3e4>] (armpmu_map_event+0x20/0x88) from [<c001c17c>] (armpmu_event_init+0x38/0x280)
[<c001c17c>] (armpmu_event_init+0x38/0x280) from [<c00bf180>] (perf_init_event+0x108/0x180)
[<c00bf180>] (perf_init_event+0x108/0x180) from [<c00bf440>] (perf_event_alloc+0x248/0x40c)
[<c00bf440>] (perf_event_alloc+0x248/0x40c) from [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc)
[<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc) from [<c0012700>] (ret_fast_syscall+0x0/0x48)
Code: 0a000005 e3540004 0a000016 e3540000 (0791010c)

This is because event->attr.config in armpmu_event_init()
contains a very large number copied directly from userspace and
is never checked against the size of the array indexed in
armpmu_map_hw_event(). Fix the problem by checking the value of
config before indexing the array and rejecting invalid config
values.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
This patch adds a base infrastructure that allows SCTP to do
memory accounting for control chunks.  Real accounting code will
follow.

This patch alos fixes the following triggered bug ...

[  553.109742] kernel BUG at include/linux/skbuff.h:1813!
[  553.109766] invalid opcode: 0000 [rabeeh#1] SMP
[  553.109789] Modules linked in: sctp libcrc32c rfcomm [...]
[  553.110259]  uinput i915 i2c_algo_bit drm_kms_helper e1000e drm ptp
pps_core i2c_core wmi video sunrpc
[  553.110320] CPU: 0 PID: 1636 Comm: lt-test_1_to_1_ Not tainted
3.11.0-rc3+ rabeeh#2
[  553.110350] Hardware name: LENOVO 74597D6/74597D6, BIOS 6DET60WW
(3.10 ) 09/17/2009
[  553.110381] task: ffff88020a01dd40 ti: ffff880204ed0000 task.ti:
ffff880204ed0000
[  553.110411] RIP: 0010:[<ffffffffa0698017>]  [<ffffffffa0698017>]
skb_orphan.part.9+0x4/0x6 [sctp]
[  553.110459] RSP: 0018:ffff880204ed1bb8  EFLAGS: 00010286
[  553.110483] RAX: ffff8802086f5a40 RBX: ffff880204303300 RCX:
0000000000000000
[  553.110487] RDX: ffff880204303c28 RSI: ffff8802086f5a40 RDI:
ffff880202158000
[  553.110487] RBP: ffff880204ed1bb8 R08: 0000000000000000 R09:
0000000000000000
[  553.110487] R10: ffff88022f2d9a04 R11: ffff880233001600 R12:
0000000000000000
[  553.110487] R13: ffff880204303c00 R14: ffff8802293d0000 R15:
ffff880202158000
[  553.110487] FS:  00007f31b31fe740(0000) GS:ffff88023bc00000(0000)
knlGS:0000000000000000
[  553.110487] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  553.110487] CR2: 000000379980e3e0 CR3: 000000020d225000 CR4:
00000000000407f0
[  553.110487] Stack:
[  553.110487]  ffff880204ed1ca8 ffffffffa068d7fc 0000000000000000
0000000000000000
[  553.110487]  0000000000000000 ffff8802293d0000 ffff880202158000
ffffffff81cb7900
[  553.110487]  0000000000000000 0000400000001c68 ffff8802086f5a40
000000000000000f
[  553.110487] Call Trace:
[  553.110487]  [<ffffffffa068d7fc>] sctp_sendmsg+0x6bc/0xc80 [sctp]
[  553.110487]  [<ffffffff8128f185>] ? sock_has_perm+0x75/0x90
[  553.110487]  [<ffffffff815a3593>] inet_sendmsg+0x63/0xb0
[  553.110487]  [<ffffffff8128f2b3>] ? selinux_socket_sendmsg+0x23/0x30
[  553.110487]  [<ffffffff8151c5d6>] sock_sendmsg+0xa6/0xd0
[  553.110487]  [<ffffffff81637b05>] ? _raw_spin_unlock_bh+0x15/0x20
[  553.110487]  [<ffffffff8151cd38>] SYSC_sendto+0x128/0x180
[  553.110487]  [<ffffffff8151ce6b>] ? SYSC_connect+0xdb/0x100
[  553.110487]  [<ffffffffa0690031>] ? sctp_inet_listen+0x71/0x1f0
[sctp]
[  553.110487]  [<ffffffff8151d35e>] SyS_sendto+0xe/0x10
[  553.110487]  [<ffffffff81640202>] system_call_fastpath+0x16/0x1b
[  553.110487] Code: e0 48 c7 c7 00 22 6a a0 e8 67 a3 f0 e0 48 c7 [...]
[  553.110487] RIP  [<ffffffffa0698017>] skb_orphan.part.9+0x4/0x6
[sctp]
[  553.110487]  RSP <ffff880204ed1bb8>
[  553.121578] ---[ end trace 46c20c5903ef5be2 ]---

The approach taken here is to split data and control chunks
creation a  bit.  Data chunks already have memory accounting
so noting needs to happen.  For control chunks, add stubs handlers.

Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
struct memcg_cache_params has a union.  Different parts of this union
are used for root and non-root caches.  A part with destroying work is
used only for non-root caches.

I fixed the same problem in another place v3.9-rc1-16204-gf101a94, but
didn't notice this one.

This patch fixes the kernel panic:

[   46.848187] BUG: unable to handle kernel paging request at 000000fffffffeb8
[   46.849026] IP: [<ffffffff811a484c>] kmem_cache_destroy_memcg_children+0x6c/0xc0
[   46.849092] PGD 0
[   46.849092] Oops: 0000 [rabeeh#1] SMP
...

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Glauber Costa <glommer@openvz.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@vger.kernel.org>    [3.9.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants