Tags

Tags give the ability to mark specific points in history as being important.
  • 1.0.1-kokkos

    Version 1.0.1 (Kokkos implementation)
    
    ADDED
    
    - Correct Pepper release paper Inspire TexKey as used in the
      `pepper_citations.tex` file written out after a run
    - Summary of the most important switches after the configuration
    - New logo
    - Detailed build instructions and bootstrap scripts; these were previously
      lacking, in particular for the Kokkos variant
    
    FIXED
    
    - Remove accidental writeout of a `vegas.txt` file into the working directory
    - Fix typos in manpage and command line help output
    - Add missing colour factors in the downloaded process data
    
    CHANGED
    
    - Simplify process specification (#34)
        - The `process_file` setting is deprecated. It can still be used, but is
          now equivalent to the `process` setting. This also means that process
          files can now be specified on the command line using `-p|--process`.
        - It is still possible to directly specify the process filename, e.g.
          "z2j.csv". However, it is now also possible, and preferred, to give a
          specification that follows the OpenLoops conventions. For the above
          example, this would be "ppzjj", which is then parsed and transformed
          to the correct Pepper filename before trying to read the file.
        - There is no longer a default process.
    - Change default of `phasespace.optimisation.n_steps` to 7 and
      `phasespace.integration.n_nonzero_min` to 45000
    - Use CMake's FetchContent method instead of git submodules to fetch external
      codes during the configuration (#36). This allows us to generate working tarballs.
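
The mapping from an OpenLoops-style specification to a process filename described above could be sketched as follows. This is only an illustration in Python, not Pepper's implementation: the helper name `spec_to_filename` is hypothetical, and the general rule (a leading "pp", a boson label, then one "j" per jet) is an assumption extrapolated from the single documented example, "ppzjj" becoming "z2j.csv".

```python
import re


def spec_to_filename(spec: str) -> str:
    """Translate a process specification into a process filename.

    Hypothetical helper: only the mapping "ppzjj" -> "z2j.csv" is documented;
    the pattern used here is an assumed generalisation of that one example.
    """
    if spec.endswith(".csv"):
        # A filename given directly is passed through unchanged.
        return spec
    # Assumed layout: "pp", a boson label without the letter "j", then the jets.
    match = re.fullmatch(r"pp([a-ik-z]+)(j*)", spec)
    if match is None:
        raise ValueError(f"unrecognised process specification: {spec!r}")
    boson, jets = match.groups()
    return f"{boson}{len(jets)}j.csv"


print(spec_to_filename("ppzjj"))    # z2j.csv
print(spec_to_filename("z2j.csv"))  # z2j.csv
```

Passing filenames through unchanged mirrors the statement that specifying the filename directly remains possible.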
  • 1.0.1-native

    Version 1.0.1 (native implementation)
    
    ADDED
    
    - Correct Pepper release paper Inspire TexKey as used in the
      `pepper_citations.tex` file written out after a run
    - Summary of the most important switches after the configuration
    - New logo
    - Detailed build instructions and bootstrap scripts; these were previously
      lacking, in particular for the Kokkos variant
    
    FIXED
    
    - Remove accidental writeout of a `vegas.txt` file into the working directory
    - Fix typos in manpage and command line help output
    - Add missing colour factors in the downloaded process data
    
    CHANGED
    
    - Simplify process specification (#34)
        - The `process_file` setting is deprecated. It can still be used, but is
          now equivalent to the `process` setting. This also means that process
          files can now be specified on the command line using `-p|--process`.
        - It is still possible to directly specify the process filename, e.g.
          "z2j.csv". However, it is now also possible, and preferred, to give a
          specification that follows the OpenLoops conventions. For the above
          example, this would be "ppzjj", which is then parsed and transformed
          to the correct Pepper filename before trying to read the file.
        - There is no longer a default process.
    - Change default of `phasespace.optimisation.n_steps` to 7 and
      `phasespace.integration.n_nonzero_min` to 45000
    - Use CMake's FetchContent method instead of git submodules to fetch external
      codes during the configuration (#36). This allows us to generate working tarballs.
  • 1.0.0-kokkos

    Version 1.0.0 (Kokkos implementation)
    
    This is the first public release of Pepper for the Kokkos variant
    that targets CPU and GPU by different vendors using the Kokkos
    portability framework.
  • 1.0.0-native

    bc41c1b0 · Merge branch 'dev'
    Version 1.0.0 (native implementation)
    
    This is the first public release of Pepper for the non-Kokkos variant
    that targets CPU and Nvidia GPU (via CUDA) natively.
    
    A description of changes with respect to 0.0.3 will be given in a later
    commit.
  • 0.0.3

    Version 0.0.3
    
    This is the first version that is fully usable to generate complete
    input for Sherpa and Pythia. It should therefore be close to a public
    release now, barring some optimisation improvements and bugfixes that
    might become necessary on the way to the release.
    
    NEW FEATURES
    
    - Add support for using host-only Chili when otherwise running on device
    - Add beam energy setting
    - Make HDF5 output compatible with the Sherpa reader
    - Full PDF and beam information in HepMC3 output
    - Cross section information in HDF5 output (in the "/generatedResult"
      field)
    - Add support for LHEF event output
    - Add leading colour configuration information to event output
    - Add VEGAS-optimised t-channel (plus one s-channel) integrator
      "Chili(mild)" with CUDA support
    - Add support for CUDA LHAPDF
    
    MINOR IMPROVEMENTS
    
    - Do MPI sum before updating helicity selection weights
    - Enable Chili max eta cuts for jets
    - Only stop optimisation/integration when all processes have been
      sampled at least once
    - Add environment variable PEPPER_DEVICE_ID for setting the CUDA device
      used
    - Scale min. number of nonzero events per optimisation/integration step
      with the number of helicity configurations and divide by the number of
      MPI ranks for easier usage
    - Make CMake configuration output a bit more verbose
    - Support nested internal timing diagnostics
    - Add H_Tp^2/2 and H_T^2/2 scale definitions
    - Improve output when zero events are requested
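
The scaling of the minimum number of nonzero events mentioned in the list above might be sketched like this. The function name and argument names are hypothetical, and rounding up to a whole number of events is an assumption; the notes only state that the setting is scaled with the number of helicity configurations and divided by the number of MPI ranks.

```python
def nonzero_min_per_rank(n_nonzero_min: int, n_helicity_configs: int, n_ranks: int) -> int:
    """Per-rank minimum of nonzero events for one optimisation/integration step.

    Hypothetical sketch: scale the user-facing setting by the number of
    helicity configurations, then share the total across the MPI ranks.
    """
    if n_helicity_configs < 1 or n_ranks < 1:
        raise ValueError("need at least one helicity configuration and one rank")
    total = n_nonzero_min * n_helicity_configs
    # Integer ceiling division (rounding up is an assumption).
    return -(-total // n_ranks)


print(nonzero_min_per_rank(1000, 8, 4))  # 2000
```

With this convention the same setting can be reused when the number of ranks changes, which is presumably the "easier usage" the entry refers to.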
    
    ADDED DOCUMENTATION
    
    - Manual guide on reusing cached results
    - Manual tutorial "Getting started"
    - Manual reference on Runcard options
    
    PERFORMANCE IMPROVEMENTS
    
    - Add CPU vector instructions for some vertices
    - Improve performance of resetting particle information in the recursion
    - Remove unnecessary D->H copies of particle information
    - Remove redundant helicity selection weight updates
    - Make H->D copy of random numbers for FORCE_HOST_RNG=1 faster
    - Evaluate Z/photon currents simultaneously
    - Improve performance of the momentum storage handling in the Chili
      interface
    - Use minimal storage for matrix elements
    - Cache the non-zero helicity configurations, which speeds up
      initialisation of subsequent runs, in particular on the device
    
    BUGFIXES
    
    - Fix read-in of selection weights
    - Fix too low output precision of cached results
    - Fix bug when returning a zero standard deviation/variance if the
      number of trials is one
    - Fix non positive definite standard deviation (and hence selection
      weights)
    - Use correct scale information in HDF5 output
    - Improve heuristics to set beams in HDF5 output
    - Increase FORM max. term number to fix colour factor generation bug
    - Fix moving FORM-generated files across filesystems
    - Fix crash in HDF5 output for weighted events
    - Fill dummy cross sections for auxiliary weights to suppress new HepMC3
      warnings
    - Fix integer overflow bug when MPI summing MC event counters for many
      ranks
    - Remove dummy-event zero counting in HDF5 output, which broke the
      Sherpa readin
    - Fix crash in HDF5 output when all events are zero
    - Correct reported LHEH5 version string from 2.0.0 to 2.0.1
    - Fix GPU device selection for MPI use
    - Fix accidental correlation in the random flavour channel selection
    - Fill correct scales and couplings in various output formats
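
To illustrate the kind of integer overflow mentioned above for summed event counters, here is a small Python sketch. It assumes, purely for illustration, a 32-bit signed counter and 4096 ranks with one million events each; the actual counter type and the fix used in Pepper are not stated in these notes. Fixed-width arithmetic is emulated with `ctypes`, since Python integers do not overflow.

```python
import ctypes


def sum_as_int32(counts):
    """Sum counters while wrapping to 32-bit signed range after each addition."""
    total = 0
    for count in counts:
        total = ctypes.c_int32(total + count).value
    return total


def sum_as_int64(counts):
    """Same sum, but wrapping to 64-bit signed range instead."""
    total = 0
    for count in counts:
        total = ctypes.c_int64(total + count).value
    return total


# Assumed example: each rank alone is far below the 32-bit limit of 2**31 - 1.
per_rank_counts = [1_000_000] * 4096
print(sum_as_int32(per_rank_counts) == sum(per_rank_counts))  # False
print(sum_as_int64(per_rank_counts) == sum(per_rank_counts))  # True
```

The point of the sketch is that no single rank needs to exceed the limit for the reduced total to do so.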
  • no_process_sampling_during_optimisation

    A version where processes are looped over during optimisation
    
    ... instead of sampled. They are, however, not summed, but just looped
    over. In any case, this turned out to converge only very slowly,
    because processes that hardly contribute are evaluated on an equal
    footing with every other process.
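
The difference between looping over processes and sampling them can be sketched as follows. This is a toy Python illustration with hypothetical function names and made-up weights, not the optimisation code itself: it only counts how many phase-space points each process would receive under the two strategies.

```python
import random
from collections import Counter


def loop_over_processes(processes, n_points):
    """Visit the processes in turn, so each gets an equal share of points."""
    return Counter(processes[i % len(processes)] for i in range(n_points))


def sample_processes(processes, weights, n_points, seed=1):
    """Draw a process for every point with probability proportional to its weight."""
    rng = random.Random(seed)
    return Counter(rng.choices(processes, weights=weights, k=n_points))


processes = ["dominant", "rare_1", "rare_2"]
weights = [0.98, 0.01, 0.01]  # assumed relative contributions

print(loop_over_processes(processes, 3000))
print(sample_processes(processes, weights, 3000))
```

When looping, the two rare processes consume two thirds of the points despite contributing little, which is consistent with the slow convergence described above.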
  • multiple_perform_kernels_and_vcl

    A version with many recursion kernels and VCL support
    
    This is very similar to the multiple_perform_kernels tag, but adds VCL
    support on top. It should perform very well on CPU with vector
    instructions, possibly better than later versions with a unified
    recursion kernel.
  • 0.0.2

    Version 0.0.2
    
    This is the first version capable of dynamic scale setting, which should
    allow us to generate practically useful results now.
    
    Important bugfixes include the calculation of phi (which affected the
    application of cuts and thus physics results), fixing the unweighting
    for processes with more than one process flavour group, and properly
    enabling the cache of Chili results (which also includes a fix on the
    Chili side).
  • 0.0.1

    Version 0.0.1
    
    This is the first version capable of generating physical (parton-level)
    LHC events, with the last building block being the proper symmetrisation
    of the initial state.
  • 0.0.0

    Version 0.0.0
    
    This pre-release tag points to the very first commit in the Pepper
    repository. At that point, the working title of the project was still
    BlockGen 2.
  • no_helicity_blocks

    af6ee081 · Fix gcc compiler warnings
    A version where an entire event block has the same sampled helicity
    
    Later, we switched to introducing smaller helicity blocks
    (helicity_block_size <= block_size), e.g. of 32 events, that share a
    single helicity. This gives better helicity sampling statistics and
    still keeps lock-step for CUDA warps (of 32 threads).
    
    Note that helicity summing is always the same and is not affected.
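
The helicity block idea described above can be sketched in Python as follows. The function name, the integer labelling of helicity configurations and the uniform sampling are assumptions made for illustration; only the relation helicity_block_size <= block_size and the example size of 32 events come from the tag message.

```python
import random


def sample_block_helicities(block_size, helicity_block_size, n_helicities, seed=1):
    """Return one helicity index per event; events in a helicity block share it."""
    if not 0 < helicity_block_size <= block_size:
        raise ValueError("need 0 < helicity_block_size <= block_size")
    rng = random.Random(seed)
    helicities = []
    for start in range(0, block_size, helicity_block_size):
        # One draw per helicity block (uniform sampling is an assumption).
        helicity = rng.randrange(n_helicities)
        n_events = min(helicity_block_size, block_size - start)
        helicities.extend([helicity] * n_events)
    return helicities


# Four helicity blocks of 32 events each in an event block of 128 events.
per_event = sample_block_helicities(128, 32, n_helicities=16)
print(len(per_event))  # 128
```

Setting `helicity_block_size` equal to `block_size` reproduces the behaviour of this tag, where the entire event block shares one sampled helicity.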
  • multiple_perform_kernels

    A version where the BG recursion still uses many kernels
    
    ... e.g. for each vertex and propagator, instead of merging them into a
    single "perform" kernel, which is more appropriate for CUDA (because
    starting too many small kernels gives too much overhead). However, on
    the CPU, this version with small kernels performs better, presumably
    because looping over events with small kernels gives better caching
    efficiency there.