1. 30 Jan, 2019 1 commit
  2. 25 Jan, 2019 1 commit
  3. 19 Dec, 2018 1 commit
  4. 06 Dec, 2018 1 commit
  5. 26 Jun, 2018 1 commit
    • When all cpus are sleeping (e.g. they have just entered WFE)
      and SystemC is waiting because it is ahead of qemu,
      a deadlock can occur in which the main_loop is also waiting.
      The solution is to wake up the main_loop so that it warps the timers,
      which makes qemu jump to the end of the quantum and unblocks SystemC.
      Clement Deschamps authored
  6. 13 Jun, 2018 1 commit
  7. 04 Jun, 2018 1 commit
  8. 22 May, 2018 2 commits
  9. 21 May, 2018 2 commits
  10. 18 May, 2018 1 commit
  11. 28 Mar, 2018 1 commit
  12. 08 Mar, 2018 1 commit
  13. 26 Feb, 2018 2 commits
  14. 23 Jan, 2018 1 commit
  15. 11 Jan, 2018 1 commit
  16. 19 Dec, 2017 1 commit
  17. 18 Dec, 2017 2 commits
    • Previously the iothread was started in qbox_end_of_elaboration(), so the
      cpu was starting ahead of SystemC models.
      
      The iothread is now still created in qbox_end_of_elaboration(), but it
      is paused right before the cpu loop starts.
      
      The new function qbox_start_of_simulation() simply unlocks the iothread.
      
      This commit also fixes a reset issue, as the structure reset_request
      was initialized before smp_cpus was set.
      Clement Deschamps authored
    • Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Michael Roth authored
  18. 15 Dec, 2017 3 commits
    • If KVM is enabled and the KVM MMU radix capability is available,
      the partition table entry (patb_entry) for the radix mode is
      initialized by default in ppc_spapr_reset().
      
      This is a problem if we want to migrate the guest to a POWER8 host
      before the guest kernel has started and set the value to the one
      expected for a POWER8 CPU.
      
      The "-machine max-cpu-compat=power8" option should allow migrating
      a guest from a POWER9 KVM host to a POWER8 KVM host, but because patb_entry
      is set, the destination QEMU tries to enable radix mode on the
      POWER8 host. This fails and cancels the migration:
      
          Process table config unsupported by the host
          error while loading state for instance 0x0 of device 'spapr'
          load of migration failed: Invalid argument
      
      This patch doesn't set the PATB entry if the user provides
      a CPU compatibility mode that doesn't support radix mode.
      
      Signed-off-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      (cherry picked from commit 1481fe5f)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Laurent Vivier authored
    • The device tree nodes ibm,arch-vec-5-platform-support and ibm,pa-features
      are used to communicate features of the cpu to the guest operating
      system. The properties of each of these are determined based on the
      selected cpu model and the availability of hypervisor features.
      Currently the compatibility mode of the cpu is not taken into account.
      
      The ibm,arch-vec-5-platform-support node is used to communicate the
      level of support for various ISAv3 processor features to the guest
      before CAS to inform the guest's request. The available mmu mode should
      only be hash unless the cpu is a POWER9 which is not in a pre-POWER9
      compat mode, in which case the available modes depend on the
      accelerator and the hypervisor capabilities.
      
      The ibm,pa-features node is used to communicate the level of cpu support
      for various features to the guest os. This should only contain features
      relevant to the operating mode of the processor, that is the selected
      cpu model taking into account any compat mode. This means that the
      compat mode should be taken into account when choosing the properties of
      ibm,pa-features and they should match the compat mode selected, or the
      cpu model selected if no compat mode.
      
      Update the setting of these cpu features in the device tree as described
      above to properly take into account any compat mode. We use the
      ppc_check_compat function which takes into account the current processor
      model and the cpu compat mode.
      
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      (cherry picked from commit 7abd43ba)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Suraj Jitindar Singh authored
    • Commit 8c37faa4 ("vfio-pci, ppc64/spapr: Reorder group-to-container
      attaching") moved registration of groups with the vfio-kvm device from
      vfio_get_group() to vfio_connect_container(), but it missed the case
      where a group is attached to an existing container and takes an early
      exit.  Perhaps this is a less common case on ppc64/spapr, but on x86
      (without viommu) all groups are connected to the same container and
      thus only the first group gets registered with the vfio-kvm device.
      This becomes a problem if we then hot-unplug the devices associated
      with that first group and we end up with KVM being misinformed about
      any vfio connections that might remain.  Fix by including the call to
      vfio_kvm_device_add_group() in this early exit path.
      
      Fixes: 8c37faa4 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching")
      Cc: qemu-stable@nongnu.org # qemu-2.10+
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Tested-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      (cherry picked from commit 2016986a)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Alex Williamson authored
  19. 07 Dec, 2017 2 commits
    • The exclusive_addr value is actually not always reset to -1.
      We now compare it to the payload address in order to determine whether the
      access is exclusive or not.
      Clement Deschamps authored
    • At guest reset time, we allocate a hash page table (HPT) for the guest
      based on the guest's RAM size.  If dynamic HPT resizing is not available, we
      use the maximum RAM size; if it is, we use the current RAM size.
      
      But the "current RAM size" calculation is incorrect - we just use the
      "base" ram_size from the machine structure.  This doesn't include any
      pluggable DIMMs that are already plugged at reset time.
      
      This means that if you try to start a 'pseries' machine with a DIMM
      specified on the command line that's much larger than the "base" RAM size,
      then the guest will get a woefully inadequate HPT.  This can lead to a
      guest freeze during boot as it runs out of HPT space during initial MMU
      setup.
      
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      Tested-by: Greg Kurz <groug@kaod.org>
      (cherry picked from commit 768a20f3)
      *drop dep on 9aa3397f
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      David Gibson authored
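
      The sizing problem described above can be sketched as follows. This is
      only an illustration, not the code from the commit: the helper name
      current_ram_size() and its parameters are hypothetical, and the sketch
      only shows that the RAM size used at reset has to include DIMMs that
      are already plugged, not just the "base" size.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): RAM visible to the guest at
 * reset time, i.e. the "base" RAM plus every DIMM that is already plugged.
 */
static uint64_t current_ram_size(uint64_t base_ram_size,
                                 const uint64_t *plugged_dimm_sizes,
                                 size_t ndimms)
{
    uint64_t total = base_ram_size;

    for (size_t i = 0; i < ndimms; i++) {
        total += plugged_dimm_sizes[i];
    }
    return total;
}
```

      With a 512 MiB base and a 4 GiB DIMM on the command line, sizing the
      HPT from the base alone would cover only a small fraction of the
      memory the guest actually sees.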
  20. 06 Dec, 2017 14 commits
    • Commit "3d90c625 vga: stop passing pointers to vga_draw_line*
      functions" is incomplete.  It doesn't handle the case that the vga
      rendering code tries to create a shared surface, i.e. a pixman image
      backed by vga video memory.  That can not work in case the guest display
      wraps from end of video memory to the start.  So force shadowing in that
      case.  Also adjust the snapshot region calculation.
      
      This can be triggered with cirrus only; when programming vbe modes using
      the bochs api (stdvga, also qxl and virtio-vga in vga compat mode),
      wraparounds can't happen.
      
      Fixes: CVE-2017-13672
      Fixes: 3d90c625
      Cc: P J P <ppandit@redhat.com>
      Reported-by: David Buchanan <d@vidbuchanan.co.uk>
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      Message-id: 20171010141323.14049-3-kraxel@redhat.com
      (cherry picked from commit 28f77de2)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Gerd Hoffmann authored
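
      A minimal sketch of the wraparound condition, under stated assumptions:
      the function name and parameters below are hypothetical and do not come
      from the vga code. It only illustrates why a display region that runs
      past the end of video memory cannot be exposed as one contiguous shared
      surface and has to be shadowed instead.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does a display region of `len` bytes starting at
 * offset `start` run past the end of a video memory of `vram_size` bytes?
 * The sum is computed in 64 bits so that it cannot overflow.
 */
static bool region_wraps(uint32_t start, uint32_t len, uint32_t vram_size)
{
    return (uint64_t)start + len > vram_size;
}
```

      A caller would fall back to a shadow surface whenever this returns
      true, since the bytes after the wrap point live at the start of video
      memory rather than directly after the first part.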
    • Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      (cherry picked from commit 362f8117)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Gerd Hoffmann authored
    • The NBD spec says that a server may fail any transmission request
      with ESHUTDOWN when it is apparent that no further request from
      the client can be successfully honored.  The client is supposed
      to then initiate a soft shutdown (wait for all remaining in-flight
      requests to be answered, then send NBD_CMD_DISC).  However, since
      qemu's server never uses ESHUTDOWN errors, this code was mostly
      untested since its introduction in commit b6f5d3b5.
      
      More recently, I learned that nbdkit as the NBD server is able to
      send ESHUTDOWN errors, so I finally tested this code, and noticed
      that our client was special-casing ESHUTDOWN to cause a hard
      shutdown (immediate disconnect, with no NBD_CMD_DISC), but only
      if the server sends this error as a simple reply.  Further
      investigation found that commit d2febedb introduced a regression
      where structured replies behave differently than simple replies -
      but that the structured reply behavior is more in line with the
      spec (even if we still lack code in nbd-client.c to properly quit
      sending further requests).  So this patch reverts the portion of
      b6f5d3b5 that introduced an improper hard-disconnect special-case
      at the lower level, and leaves the future enhancement of a nicer
      soft-disconnect at the higher level for another day.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171113194857.13933-1-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      (cherry picked from commit 01b05c66)
       Conflicts:
      	nbd/client.c
      *drop dep on d2febedb
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Eric Blake authored
    • The NBD spec says that clients should not try to write/trim to
      an export advertised as read-only by the server.  But we failed
      to check that, and would allow the block layer to use NBD with
      BDRV_O_RDWR even when the server is read-only, which meant we
      were depending on the server sending a proper EPERM failure for
      various commands, and also exposes a leaky abstraction: using
      qemu-io in read-write mode would succeed on 'w -z 0 0' because
      of local short-circuiting logic, but 'w 0 0' would send a
      request over the wire (where it then depends on the server, and
      fails at least for qemu-nbd but might pass for other NBD
      implementations).
      
      With this patch, a client MUST request read-only mode to access
      a server that is doing a read-only export, or else it will get
      a message like:
      
      can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export
      
      It is no longer possible to even attempt writes over the wire
      (including the corner case of 0-length writes), because the block
      layer enforces the explicit read-only request; this matches the
      behavior of qcow2 when backed by a read-only POSIX file.
      
      Fix several iotests to comply with the new behavior (since
      qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP,
      default to a read-only export, we must tell blockdev-add/qemu-io to
      set up a read-only client).
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20171108215703.9295-3-eblake@redhat.com>
      Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      (cherry picked from commit 1104d83c)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Eric Blake authored
    • namelen should be used here; length is unrelated and always 0 at this
      point.  Broken since its introduction in commit f37708f6, but mostly
      harmless (replying with '' as the name does not violate protocol,
      and does not confuse qemu as the nbd client since our implementation
      does not ask for the name; but might confuse some other client that
      does ask for the name especially if the default export is different
      than the export name being queried).
      
      Adding an assert makes it obvious that we are not skipping any bytes
      in the client's message, as well as making it obvious that we were
      using the wrong variable.
      
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      CC: qemu-stable@nongnu.org
      Message-Id: <20171101154204.27146-1-vsementsov@virtuozzo.com>
      [eblake: improve commit message, squash in assert addition]
      Signed-off-by: Eric Blake <eblake@redhat.com>
      
      (cherry picked from commit 46321d6b)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Vladimir Sementsov-Ogievskiy authored
    • Since commit f1f9e6c5 "vhost: adapt vhost_verify_ring_mappings() to
      virtio 1 ring layout", we check the mapping of each part (descriptor
      table, available ring and used ring) of each virtqueue separately.
      
      The checking of a part is done by the vhost_verify_ring_part_mapping()
      function: it returns either 0 on success or a negative errno if the
      part cannot be mapped at the same place.
      
      Unfortunately, the vhost_verify_ring_mappings() function checks its
      return value the other way round. It means that we either:
      - only verify the descriptor table of the first virtqueue, and if it
        is valid we ignore all the other mappings
      - or ignore all broken mappings until we reach a valid one
      
      i.e., we only raise an error if all mappings are broken, and we consider
      all mappings valid otherwise (false success), which is obviously
      wrong.
      
      This patch ensures that vhost_verify_ring_mappings() only returns
      success if ALL mappings are okay.
      
      Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      (cherry picked from commit 2fe45ec3)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Greg Kurz authored
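
      The intended behaviour can be sketched as below. This is a simplified
      stand-in, not the vhost code: the callback type and the helper names
      are hypothetical. Each part check returns 0 on success or a negative
      errno, and the overall check succeeds only if every part does.

```c
#include <errno.h>

/* Hypothetical per-part check: 0 on success, negative errno on failure. */
typedef int (*part_check_fn)(int part);

/* Returns 0 only if ALL parts verify; otherwise the first error seen. */
static int verify_all_parts(part_check_fn check, int nparts)
{
    for (int i = 0; i < nparts; i++) {
        int r = check(i);

        if (r < 0) {
            return r;   /* one broken mapping fails the whole check */
        }
    }
    return 0;
}

/* Illustration only: every part maps correctly. */
static int all_parts_ok(int part)
{
    (void)part;
    return 0;
}

/* Illustration only: part 2 cannot be mapped at the same place. */
static int part_two_is_broken(int part)
{
    return part == 2 ? -ENOMEM : 0;
}
```

      With the inverted test described above, a loop like this would stop
      at the first valid part and report success, which is how broken
      mappings went unnoticed.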
    • Introduced in commit f37708f6 (2.10).  The NBD spec says a client
      can request export names up to 4096 bytes in length, even though
      they should not expect success on names longer than 256.  However,
      qemu hard-codes the limit of 256, and fails to filter out a client
      that probes for a longer name; the result is a stack smash that can
      potentially give an attacker arbitrary control over the qemu
      process.
      
      The smash can be easily demonstrated with this client:
      $ qemu-io -f raw nbd://localhost:10809/$(printf %3000d 1 | tr ' ' a)
      
      If the qemu NBD server binary (whether the standalone qemu-nbd, or
      the builtin server of QMP nbd-server-start) was compiled with
      -fstack-protector-strong, the ability to exploit the stack smash
      into arbitrary execution is a lot more difficult (but still
      theoretically possible to a determined attacker, perhaps in
      combination with other CVEs).  Still, crashing a running qemu (and
      losing the VM) is bad enough, even if the attacker did not obtain
      full execution control.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      (cherry picked from commit 51ae4f84)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Eric Blake authored
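
      The kind of bounds check that prevents such a stack smash can be
      sketched as follows. The names MAX_NAME_SIZE and copy_export_name()
      are hypothetical and chosen for illustration; the 256-byte limit is
      taken from the description above. The point is that the
      client-supplied length is validated before anything is copied into a
      fixed-size buffer.

```c
#include <stdint.h>
#include <string.h>

#define MAX_NAME_SIZE 256   /* assumed server-side limit, per the text */

/*
 * Hypothetical helper: copy a client-supplied export name into a
 * fixed-size buffer. Returns 0 on success, or -1 if the client asked for
 * a name longer than the buffer can hold.
 */
static int copy_export_name(char dst[MAX_NAME_SIZE + 1],
                            const char *src, uint32_t namelen)
{
    if (namelen > MAX_NAME_SIZE) {
        return -1;          /* reject instead of overrunning dst */
    }
    memcpy(dst, src, namelen);
    dst[namelen] = '\0';
    return 0;
}
```

      A 3000-byte name like the one produced by the command above is
      rejected up front, while names of up to 256 bytes are still accepted.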
    • The NBD spec gives us permission to abruptly disconnect on clients
      that send outrageously large option requests, rather than having
      to spend the time reading to the end of the option.  No real
      option request requires that much data anyways; and meanwhile, we
      already have the practice of abruptly dropping the connection on
      any client that sends NBD_CMD_WRITE with a payload larger than 32M.
      
      For comparison, nbdkit drops the connection on any request with
      more than 4096 bytes; however, that limit is probably too low
      (as the NBD spec states an export name can theoretically be up
      to 4096 bytes, which means a valid NBD_OPT_INFO could be even
      longer) - even if qemu doesn't permit exports longer than 256
      bytes.
      
      It could be argued that a malicious client trying to get us to
      read nearly 4G of data on a bad request is a form of denial of
      service.  In particular, if the server requires TLS, but a client
      that does not know the TLS credentials sends any option (other
      than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
      payload of nearly 4G, then the server was keeping the connection
      alive trying to read all the payload, tying up resources that it
      would rather be spending on a client that can get past the TLS
      handshake.  Hence, this warranted a CVE.
      
      Present since at least 2.5 when handling known options, and made
      worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
      to handle unknown options.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      (cherry picked from commit fdad35ef)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Eric Blake authored
    • Guest state should not be touched if the VM is stopped. Unfortunately,
      we didn't check the running state and tried to drain the tx queue
      unconditionally in virtio_net_set_status(). A crash was then noticed
      on a migration destination when the user typed quit after the
      virtqueue state was loaded but before the region cache was
      initialized. In this case, virtio_net_drop_tx_queue_data() tries to
      access the uninitialized region cache.
      
      Fix this by only dropping tx queue data when vm is running.
      
      Fixes: 283e2c2a ("net: virtio-net discards TX data after link down")
      Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: qemu-stable@nongnu.org
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      (cherry picked from commit 70e53e6e)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Jason Wang authored
    • DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE) was overflowing ret (int) if
      st.st_size is greater than 1TB.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Peter Lieven <pl@kamp.de>
      Message-id: 1511798407-31129-1-git-send-email-pl@kamp.de
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      (cherry picked from commit f1a7ff77)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Peter Lieven authored
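
      The arithmetic behind this overflow can be sketched as below. The
      helper names are hypothetical, and SECTOR_SIZE is assumed to be 512 to
      match BDRV_SECTOR_SIZE. At 1 TiB the rounded-up sector count reaches
      2^31, which no longer fits in a 32-bit int, so the count has to be
      computed and kept in a 64-bit type.

```c
#include <limits.h>
#include <stdint.h>

#define SECTOR_SIZE 512     /* assumed to match BDRV_SECTOR_SIZE */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: sector count computed entirely in 64 bits. */
static int64_t size_to_sectors(int64_t size_bytes)
{
    return DIV_ROUND_UP(size_bytes, (int64_t)SECTOR_SIZE);
}

/* Hypothetical helper: would the sector count fit in a plain int? */
static int sectors_fit_in_int(int64_t size_bytes)
{
    return size_to_sectors(size_bytes) <= INT_MAX;
}
```

      For a file of exactly 1 TiB, size_to_sectors() yields 2147483648,
      one more than INT_MAX on platforms with a 32-bit int.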
    • The u-boot sources we ship currently cause problems with unpacking on
      a case-insensitive filesystem due to path conflicts. This has been
      fixed in upstream u-boot via commit 610eec7f, but since it is not
      yet included in an official release we implement this approach as a
      temporary workaround.
      
      Once we move to a u-boot containing commit 610eec7f we should revert
      this patch.
      
      Cc: qemu-stable@nongnu.org
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Richard Henderson <richard.henderson@linaro.org>
      Cc: Thomas Huth <thuth@redhat.com>
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Suggested-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Message-id: 20171107205201.10207-1-mdroth@linux.vnet.ibm.com
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      (cherry picked from commit d0dead3b)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Michael Roth authored
    • A DRC with a pending unplug request releases its associated device at
      machine reset time.
      
      In the case of LMB, when all DRCs for a DIMM device have been reset,
      the DIMM gets unplugged, causing guest memory to disappear. This may
      be very confusing for anything still using this memory.
      
      This is exactly what happens with vhost backends, and QEMU aborts
      with:
      
      qemu-system-ppc64: used ring relocated for ring 2
      qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
       `r >= 0' failed.
      
      The issue is that each DRC registers a QEMU reset handler, and we
      don't control the order in which these handlers are called (i.e.,
      an LMB DRC will unplug a DIMM before the virtio device using the
      memory on this DIMM could stop its vhost backend).
      
      To avoid such situations, let's reset DRCs after all devices
      have been reset.
      
      Reported-by: Mallesh N. Koti <mallesh@linux.vnet.ibm.com>
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
      Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      (cherry picked from commit 82512483)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Greg Kurz authored
    • The sPAPR machine isn't clearing up the pending events QTAILQ on
      machine reboot. This allows unprocessed hotplug/epow events
      to persist in the queue after reset and, when the IRQs are reasserted
      in check_exception later on, these will be processed by the OS.
      
      This patch implements a new function called 'spapr_clear_pending_events'
      that clears up the pending_events QTAILQ. This helper is then called
      inside ppc_spapr_reset to clear up the events queue, preventing
      old/deprecated events from persisting after a reset.
      
      Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      (cherry picked from commit 56258174)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Daniel Henrique Barboza authored
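
      Draining such a queue can be sketched with the BSD TAILQ macros from
      <sys/queue.h>, used here as a stand-in for QEMU's QTAILQ. The struct
      and function names are hypothetical and are not the ones from the
      patch. Each entry is removed from the queue and then freed, so nothing
      queued before the reset survives it.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical event entry; the real log entries carry more state. */
struct event {
    int id;
    TAILQ_ENTRY(event) next;
};

TAILQ_HEAD(event_queue, event);

/* Append a new event; returns 0 on success, -1 on allocation failure. */
static int queue_event(struct event_queue *q, int id)
{
    struct event *e = malloc(sizeof(*e));

    if (!e) {
        return -1;
    }
    e->id = id;
    TAILQ_INSERT_TAIL(q, e, next);
    return 0;
}

/* Remove and free every pending event; returns how many were dropped. */
static unsigned clear_pending_events(struct event_queue *q)
{
    unsigned dropped = 0;
    struct event *e;

    while ((e = TAILQ_FIRST(q)) != NULL) {
        TAILQ_REMOVE(q, e, next);
        free(e);
        dropped++;
    }
    return dropped;
}
```

      Taking the first element repeatedly avoids iterating over a list
      while its entries are being freed.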
    • vhost_virtqueue_stop() gets the avail index value from the backend,
      except if the backend is not responding.
      
      This happens when the backend crashes. In this case, the internal
      state of the virtio queue is inconsistent, which causes packets
      to corrupt the vring state.
      
      With a Linux guest, this results in the following error messages on
      backend reconnection:
      
      [   22.444905] virtio_net virtio0: output.0:id 0 is not a head!
      [   22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
      [   22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
      
      Fixes: 283e2c2a ("net: virtio-net discards TX data after link down")
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      (cherry picked from commit 2ae39a11)
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Maxime Coquelin authored