1. 08 Mar, 2016 2 commits
    • configs: Add a lat_mem_rd style test script · 25ff96a4
      Andreas Hansson authored
      This patch adds a config script that broadly replicates the behaviour
      of lat_mem_rd. The test is based on traffic generators, and as such we
      simply randomise addresses in increasingly large ranges, and play them
      back using the trace functionality of the traffic generator.
      
      The test script is accompanied by a post-processing and visualisation
      script. At the moment no configurability is added to tweak the memory
      hierarchy, but a follow-on patch could easily extend the
      functionality.
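
The address generation described above can be sketched roughly as follows. This is not the config script from the patch, only a minimal illustration; the function names, the 64-byte line size, and the power-of-two range sweep are assumptions made for the example.

```python
import random

# Hypothetical helper names; the real script drives gem5 traffic generators.
def make_trace(range_bytes, num_accesses, line_bytes=64, seed=0):
    """Return cache-line-aligned random read addresses in [0, range_bytes)."""
    rng = random.Random(seed)
    num_lines = max(1, range_bytes // line_bytes)
    return [rng.randrange(num_lines) * line_bytes for _ in range(num_accesses)]

def sweep(min_bytes=1024, max_bytes=64 * 1024, num_accesses=16):
    """Build one random trace per range, doubling the range each step."""
    traces = {}
    size = min_bytes
    while size <= max_bytes:
        traces[size] = make_trace(size, num_accesses, seed=size)
        size *= 2
    return traces

traces = sweep()
```

Each trace would then be played back and the average latency per range plotted, which is where the steps between cache levels show up.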
    • syscall_emul: Fix erroneous use of delete · 8ed47851
      Andreas Hansson authored
      clang correctly points out an erroneous use of delete.
  2. 17 Jun, 2015 1 commit
    • sim: Add voltage() function to clocked_object · 7e9c19f3
      David Guillen Fandos authored
      Adding voltage function which returns the current voltage
      for a given clocked object. It's handy for power models and
      similar code that needs to retrieve the voltage. Function
      frequency() is already there, so I see no reason for not having
      this one too.
  3. 05 May, 2015 1 commit
    • cpu: Change literal integer constants to meaningful labels · cf96bb07
      Rekai Gonzalez Alberquilla authored
      fu_pool and inst_queue were using -1 for "no such FU" and -2 for "all those
      FUs are busy at the moment" when requesting an FU and replying. This
      patch introduces new constants NoCapableFU and NoFreeFU respectively.
      
      In addition, the condition (idx == -2 || idx != -1) is equivalent to
      (idx != -1), so this patch also simplifies that.
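
The equivalence claimed above can be checked mechanically. The sketch below uses Python stand-ins for the two constants (the patch itself is C++, and the upper-case names here are only illustrative spellings of NoCapableFU and NoFreeFU).

```python
# Stand-ins for the constants introduced by the patch (values from the text).
NO_CAPABLE_FU = -1  # "no such FU"
NO_FREE_FU = -2     # "all those FUs are busy at the moment"

def original_condition(idx):
    # The condition as it appeared before the simplification.
    return idx == NO_FREE_FU or idx != NO_CAPABLE_FU

def simplified_condition(idx):
    # idx == -2 already implies idx != -1, so the first clause is redundant.
    return idx != NO_CAPABLE_FU

# Compare both forms over a range that covers the sentinels and valid indices.
equivalent = all(
    original_condition(i) == simplified_condition(i) for i in range(-10, 11)
)
```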
  4. 04 Mar, 2016 1 commit
  5. 27 Nov, 2015 1 commit
  6. 26 Nov, 2015 3 commits
  7. 27 Nov, 2015 1 commit
    • base: Add support for changing output directories · b03e83ac
      Andreas Sandberg authored
      This changeset adds support for changing the simulator output
      directory. This can be useful when the simulation goes through several
      stages (e.g., a warming phase, a simulation phase, and a verification
      phase) since it allows the output from each stage to be located in a
      different directory. Relocation is done by calling core.setOutputDir()
      from Python or simout.setOutputDirectory() from C++.
      
      This change affects several parts of the design of gem5's output
      subsystem. First, files returned by an OutputDirectory instance (e.g.,
      simout) are of the type OutputStream instead of a std::ostream. This
      allows us to do some more bookkeeping and control re-opening of files
      when the output directory is changed. Second, new subdirectories are
      OutputDirectory instances, which should be used to create files in
      that sub-directory.
      
      Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se>
      [sascha.bischoff@arm.com: Rebased patches onto a newer gem5 version]
      Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
      Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
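
To illustrate why a wrapper stream type helps here, the following is a toy model in plain Python, not gem5 code. The class names echo those in the commit message, but the methods (create, set_directory, relocate) and the stage names are assumptions made for this sketch.

```python
import os
import tempfile

class OutputStream:
    """Toy wrapper that can re-open its file in another directory."""
    def __init__(self, directory, name):
        self.name = name
        self._file = None
        self.relocate(directory)

    def relocate(self, directory):
        # Close the old file (if any) and re-open under the new directory.
        if self._file is not None:
            self._file.close()
        self._file = open(os.path.join(directory, self.name), "a")

    def write(self, text):
        self._file.write(text)
        self._file.flush()

    def close(self):
        self._file.close()

class OutputDirectory:
    """Toy directory object that tracks the streams it has handed out."""
    def __init__(self, path):
        os.makedirs(path, exist_ok=True)
        self.path = path
        self._streams = {}

    def create(self, name):
        stream = OutputStream(self.path, name)
        self._streams[name] = stream
        return stream

    def set_directory(self, path):
        # Bookkeeping pays off here: every known stream follows the move.
        os.makedirs(path, exist_ok=True)
        self.path = path
        for stream in self._streams.values():
            stream.relocate(path)

    def close(self):
        for stream in self._streams.values():
            stream.close()

root = tempfile.mkdtemp()
simout = OutputDirectory(os.path.join(root, "warmup"))
stats = simout.create("stats.txt")
stats.write("warmup\n")
simout.set_directory(os.path.join(root, "measure"))
stats.write("measure\n")
simout.close()
```

The caller keeps using the same stream object across stages, and each stage's output lands in its own directory.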
  8. 29 Feb, 2016 1 commit
  9. 10 Aug, 2015 1 commit
    • mem, cpu: Add assertions to snoop invalidation logic · b437f9a0
      Stephan Diestelhorst authored
      This patch adds assertions that enforce that only invalidating snoops
      will ever reach into the logic that tracks in-order load completion and
      also invalidation of LL/SC (and MONITOR / MWAIT) monitors. Also adds
      some comments to MSHR::replaceUpgrades().
  10. 19 Jul, 2015 1 commit
    • cpu: Fix LLSC atomic CPU wakeup · be72bd83
      Krishnendra Nathella authored
      Writes to locked memory addresses (LLSC) did not wake up the locking
      CPU. This can lead to deadlocks on multi-core runs. In AtomicSimpleCPU,
      recvAtomicSnoop was checking if the incoming packet was an invalidation
      (isInvalidate) and only then handled a locked snoop. But, writes are
      seen instead of invalidates when running without caches (fast-forward
      configurations). As a simple fix, handleLockedSnoop is now also
      called when the incoming snoop packet is a write.
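
The behavioural change can be illustrated with a toy model. This is not the AtomicSimpleCPU code; the Packet and ToyCPU classes and the handle_writes switch are hypothetical and exist only to contrast the old and new conditions.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    addr: int
    is_invalidate: bool = False
    is_write: bool = False

class ToyCPU:
    """Toy CPU that sleeps on a load-locked address until it is snooped."""
    def __init__(self):
        self.locked_addr = None
        self.woken = False

    def load_locked(self, addr):
        self.locked_addr = addr

    def recv_atomic_snoop(self, pkt, handle_writes=True):
        # Old behaviour: only invalidations count (handle_writes=False).
        # New behaviour: writes count too, as seen when running without caches.
        relevant = pkt.is_invalidate or (handle_writes and pkt.is_write)
        if relevant and pkt.addr == self.locked_addr:
            self.locked_addr = None  # the lock is broken
            self.woken = True        # the locking CPU wakes up

old = ToyCPU()
old.load_locked(0x100)
old.recv_atomic_snoop(Packet(0x100, is_write=True), handle_writes=False)

new = ToyCPU()
new.load_locked(0x100)
new.recv_atomic_snoop(Packet(0x100, is_write=True))
```

With the old condition the write goes unnoticed and the toy CPU stays asleep, which is the deadlock scenario described above.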
  11. 29 Feb, 2016 2 commits
  12. 24 Feb, 2016 3 commits
  13. 23 Feb, 2016 4 commits
    • dev, arm: Implement the NoMali reset callback · 0eb96a56
      Andreas Sandberg authored
      Add a callback handler for the NoMali reset callback. This callback is
      called whenever the GPU is reset using the register interface or the
      NoMali API. The callback can be used to override ID registers using
      the raw register API.
    • dev, arm: Refactor the NoMali GPU · 3a84fa47
      Andreas Sandberg authored
      Refactor and clean up the NoMaliGpu class:
      
        * Use a std::map instead of a switch block to map the parameter enum
          describing the GPU type to a NoMali type.
      
        * Remove redundant NoMali handle from the interrupt callback.
      
        * Make callbacks and API wrappers protected instead of private to
          enable future extensions.
      
        * Wrap remaining NoMali API calls.
    • arm: Ship Linux device trees with gem5 · e162ddb3
      Andreas Sandberg authored
      Ship aarch32 and aarch64 device trees with gem5. We currently ship
      device trees as a part of the gem5 Linux kernel repository. This makes
      tracking hard since device trees are supposed to be platform dependent
      rather than kernel dependent (Linux considers device trees to be a
      stable kernel ABI). It also makes code sharing between aarch32 and
      aarch64 impossible.
      
      This changeset implements a set of device trees for the new
      VExpress_GEM5_V1 platform. The platform is described in a shared file
      that is separate from the memory/CPU description. Due to differences
      in how secondary CPUs are initialized, aarch32 and aarch64 use
      different base files describing CPU nodes and the machine's
      compatibility property.
    • scons: Add missing override to appease clang · bcacc650
      Andreas Hansson authored
      Make clang happy...again.
  14. 18 Feb, 2016 2 commits
  15. 17 Feb, 2016 3 commits
  16. 15 Feb, 2016 2 commits
  17. 14 Feb, 2016 1 commit
    • ruby: make DMASequencer inherit from RubyPort · b06bc0e7
      Michael LeBeane authored
      This patch essentially rolls back 10518:30e3715c9405 to make RubyPort the
      parent class of DMASequencer.  It removes redundant code and restores some
      features which were lost when directly inheriting from MemObject.  For
      example, DMASequencer can now communicate with other devices using PIO,
      which is useful for memory-mapped communication between multiple
      DMADevices.
  18. 13 Feb, 2016 2 commits
    • configs: add command-line option to stop debug output · 99f78361
      Michael LeBeane authored
      This patch adds a --debug-end flag to main.py so that debug output can be
      stopped at a specified tick, while allowing the simulation to continue. It is
      useful in situations where you would like to produce a trace for a region of
      interest while still collecting stats for the entire run. This is in contrast
      to the currently existing --debug-break flag, which terminates the simulation
      at the tick.
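
The contrast between the two flags, as this commit message describes them, can be sketched with a toy tick loop. This is not gem5's main.py; the run function and its parameter names are hypothetical and merely mirror the flag names.

```python
def run(total_ticks, debug_start=0, debug_end=None, debug_break=None):
    """Toy simulation loop: returns (ticks with debug output, ticks executed)."""
    trace = []
    executed = 0
    for tick in range(total_ticks):
        # Modelled after the description above: the run stops at this tick.
        if debug_break is not None and tick >= debug_break:
            break
        # Debug output is only produced inside [debug_start, debug_end).
        if tick >= debug_start and (debug_end is None or tick < debug_end):
            trace.append(tick)
        executed += 1
    return trace, executed

end_trace, end_executed = run(10, debug_end=4)
break_trace, break_executed = run(10, debug_break=4)
```

Both runs produce the same trace for ticks 0 to 3, but only the debug_end run executes all ten ticks and so keeps collecting statistics for the whole run.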
    • syscall_emul: Implement clock_getres() system call · 276b059e
      Michael LeBeane authored
      This patch implements the clock_getres() system call for ARM and x86 in Linux
      SE mode.
  19. 10 Feb, 2016 6 commits
    • mem: Be less conservative in clearing load locks in the cache · a5b7d390
      Andreas Hansson authored
      Avoid being overly conservative in clearing load locks in the cache,
      and allow writes to the line if they are from the same context. This
      is in line with ALPHA and ARM.
    • mem: Move the point of coherency to the coherent crossbar · 3add33e4
      Andreas Hansson authored
      This patch introduces the ability of making the coherent crossbar the
      point of coherency. If so, the crossbar does not forward packets where
      a cache with ownership has already committed to responding, and also
      does not forward any coherency-related packets that are not intended
      for a downstream memory controller. Thus, invalidations and upgrades
      are turned around in the crossbar, and the memory controller only sees
      normal reads and writes.
      
      In addition this patch moves the express snoop promotion of a packet
      to the crossbar, thus allowing the downstream cache to check the
      express snoop flag (as it should) for bypassing any blocking, rather
      than relying on whether a cache is responding or not.
    • mem: Align cache behaviour in atomic when upstream is responding · 567454c6
      Andreas Hansson authored
      Adopt the same flow as in timing mode, where the caches on the path to
      memory get to keep the line (if present), and we use the
      responderHadWritable flag to determine if we need to forward the
      (invalidating) packet or not.
    • mem: Align how snoops are handled when hitting writebacks · a3306e18
      Andreas Hansson authored
      This patch unifies the snoop handling in case of hitting writebacks
      with how we handle snoops hitting in the tags. As a result, we end up
      using the same optimisation as the normal snoops, where we inform the
      downstream cache if we encounter a line in Modified (writable and
      dirty) state, which enables us to avoid sending out express snoops to
      invalidate any Shared copies of the line. A few regressions
      consequently change, as some transactions are sunk higher up in the
      cache hierarchy.
    • mem: Deduce if cache should forward snoops · f8a0f6ab
      Andreas Hansson authored
      This patch changes how the cache determines if snoops should be
      forwarded from the memory side to the CPU side. Instead of having a
      parameter, the cache now looks at the port connected on the CPU side,
      and if it is a snooping port, then snoops are forwarded. This is less
      error-prone and leaves fewer parameters to worry about.
      
      The patch also tidies up the CPU classes to ensure that their I-side
      port is not snooping by removing overrides to the snoop request
      handler, such that snoop requests will panic via the default
      MasterPort implementation.
  20. 08 Feb, 2016 1 commit
    • scons: always generate sim/tags.cc · e21bc393
      Curtis Dunham authored
      Due to insufficient build deps, the checkpoint tags might not get
      updated; this commit solves this. Due to the uncommon nature of the
      build target, regenerating tags.cc is a fairly clean solution. Since
      SCons hashes file contents, it won't recompile anything unless a new
      checkpoint upgrader is actually added.
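
The reasoning above relies on content-based change detection. The sketch below is a toy model of that idea, not SCons itself; ToyBuilder and its methods are hypothetical, SHA-256 is used only as a convenient content hash, and the tag strings are made up for the example.

```python
import hashlib

def content_signature(data: bytes) -> str:
    """Hash of the file contents; timestamps play no role."""
    return hashlib.sha256(data).hexdigest()

class ToyBuilder:
    """Toy build step that recompiles only when generated contents change."""
    def __init__(self):
        self._signatures = {}
        self.recompiled = []

    def build(self, name, generated: bytes):
        sig = content_signature(generated)
        if self._signatures.get(name) != sig:
            self._signatures[name] = sig
            self.recompiled.append(name)

builder = ToyBuilder()
builder.build("sim/tags.cc", b"tags: a, b")     # first build: compiled
builder.build("sim/tags.cc", b"tags: a, b")     # regenerated, identical: skipped
builder.build("sim/tags.cc", b"tags: a, b, c")  # new upgrader tag: compiled
```

Regenerating the file on every build is therefore cheap: identical contents produce an identical signature and nothing downstream is rebuilt.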
  21. 06 Feb, 2016 1 commit
    • x86: revamp cmpxchg8b/cmpxchg16b implementation · cdaba1bb
      Alexandru Dutu authored
      The previous implementation did a pair of nested RMW operations,
      which isn't compatible with the way that locked RMW operations are
      implemented in the cache models.  It was convenient though in that
      it didn't require any new micro-ops, and supported cmpxchg16b using
      64-bit memory ops.  It also worked in AtomicSimpleCPU where
      atomicity was guaranteed by the core and not by the memory system.
      It did not work with timing CPU models though.
      
      This new implementation defines new 'split' load and store micro-ops
      which allow a single memory operation to use a pair of registers as
      the source or destination, then uses a single ldsplit/stsplit RMW
      pair to implement cmpxchg.  This patch requires support for 128-bit
      memory accesses in the ISA (added via a separate patch) to support
      cmpxchg16b.
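
A rough model of the split approach, in plain Python rather than x86 microcode: one 128-bit memory access is paired with two 64-bit registers. The function names echo ldsplit/stsplit, but their signatures, the dictionary-backed memory, and the failure handling are assumptions made for this sketch.

```python
MASK64 = (1 << 64) - 1

def ldsplit(memory, addr):
    """Single 128-bit load whose result is split into (low, high) registers."""
    value = memory[addr]
    return value & MASK64, (value >> 64) & MASK64

def stsplit(memory, addr, low, high):
    """Single 128-bit store whose data comes from a pair of 64-bit registers."""
    memory[addr] = ((high & MASK64) << 64) | (low & MASK64)

def cmpxchg16b(memory, addr, expected, replacement):
    """Toy compare-and-exchange built on one ldsplit/stsplit pair.

    expected and replacement are (low, high) pairs. Returns (success, observed).
    """
    observed = ldsplit(memory, addr)
    if observed == expected:
        stsplit(memory, addr, *replacement)
        return True, observed
    return False, observed

memory = {0: (2 << 64) | 1}
ok, seen = cmpxchg16b(memory, 0, expected=(1, 2), replacement=(3, 4))
failed_ok, failed_seen = cmpxchg16b(memory, 0, expected=(1, 2), replacement=(5, 6))
```

Because the read and the write are each a single memory operation, the pair maps onto one locked RMW sequence instead of two nested ones.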