Using SModelS

SModelS can take SLHA or LHE files as input (see Basic Input). It ships with a command-line tool runSModelS.py, which reports on the SMS decomposition and theory predictions in several output formats.

For users more familiar with Python and the SModelS basics, an example code Example.py is provided showing how to access the main SModelS functionalities: decomposition, the database and computation of theory predictions.

The command-line tool (runSModelS.py) and the example Python code (Example.py) are described below.

Note

For non-MSSM (incl. non-SUSY) input models the user needs to write their own model.py file and specify which BSM particles are even or odd under the assumed Z₂ symmetry (see adding new particles). From version 1.2.0 onwards it is also necessary to define the BSM particle quantum numbers in the same file [1].

runSModelS.py

runSModelS.py covers several different applications of the SModelS functionality, with the option of turning various features on or off, as well as setting the basic parameters. These functionalities include detailed checks of input SLHA files, running the decomposition, evaluating the theory predictions and comparing them to the experimental limits available in the database, determining missing topologies and printing the output in several available formats.

Starting with v1.1, runSModelS.py offers two additional features. First, it can process a folder containing a set of SLHA or LHE files. Second, it can process the files in such a folder in parallel.

The usage of runSModelS.py is:

runSModelS.py [-h] -f FILENAME [-p PARAMETERFILE] [-o OUTPUTDIR] [-d] [-t] [-C] [-V] [-c] [-v VERBOSE] [-T TIMEOUT]

arguments:

`-h, --help`	show this help message and exit
`-f FILENAME, --filename FILENAME`
	name of SLHA or LHE input file or a directory path (required argument). If a directory is given, loop over all files in the directory
`-p PARAMETERFILE, --parameterFile PARAMETERFILE`
	name of parameter file, where most options are defined (optional argument). If not set, use all parameters from smodels/etc/parameters_default.ini
`-o OUTPUTDIR, --outputDir OUTPUTDIR`
	name of output directory (optional argument). The default folder is: ./results/
`-d, --development`
	if set, SModelS will run in development mode and exit if any errors are found.
`-t, --force_txt`
	force loading the text database
`-C, --colors`	colored output
`-V, --version`	show program’s version number and exit
`-c, --run-crashreport`
	parse crash report file and use its contents for a SModelS run. Supply the crash file simply via ‘--filename myfile.crash’
`-v VERBOSE, --verbose VERBOSE`
	sets the verbosity level (debug, info, warning, error). Default value is info.
`-T TIMEOUT, --timeout TIMEOUT`
	define a limit on the running time (in seconds). If not set, run without a time limit. If a directory is given as input, the timeout is applied to each individual file.

A typical usage example is:

runSModelS.py -f inputFiles/slha/simplyGluino.slha -p parameters.ini -o ./ -v warning

The resulting output will be generated in the current folder, according to the printer options set in the parameters file.
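
When many runs have to be scripted, the command line above can be assembled programmatically. The following is a minimal sketch that only builds the argument list from the documented flags (-f, -p, -o, -v, -T). The helper name build_runsmodels_command is hypothetical and not part of SModelS, and nothing is executed here.

```python
import shlex


def build_runsmodels_command(filename, parameter_file=None, output_dir=None,
                             verbose=None, timeout=None):
    """Assemble a runSModelS.py argument list (hypothetical helper, not part of SModelS)."""
    cmd = ["runSModelS.py", "-f", filename]  # -f is the only required argument
    if parameter_file is not None:
        cmd += ["-p", parameter_file]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if verbose is not None:
        if verbose not in ("debug", "info", "warning", "error"):
            raise ValueError("unknown verbosity level: %s" % verbose)
        cmd += ["-v", verbose]
    if timeout is not None:
        cmd += ["-T", str(int(timeout))]  # time limit in seconds, applied per file
    return cmd


# Rebuild the typical usage example shown above
cmd = build_runsmodels_command("inputFiles/slha/simplyGluino.slha",
                               parameter_file="parameters.ini",
                               output_dir="./", verbose="warning")
print(shlex.join(cmd))
```

Once SModelS is installed and runSModelS.py is on the path, a list built this way could be handed to subprocess.run.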

The Parameters File

The basic options and parameters used by runSModelS.py are defined in the parameters file. An example parameter file, including all available parameters together with a short description, is stored in parameters.ini. If no parameter file is specified, the default parameters stored in smodels/etc/parameters_default.ini are used. Below we give more detailed information about each entry in the parameters file.

options: main options for turning SModelS features on or off

checkInput (True/False): if True, runSModelS.py will run the file check tool on the input file and verify if the input contains all the necessary information.

doInvisible (True/False): turns invisible compression on or off during the decomposition.

doCompress (True/False): turns mass compression on or off during the decomposition.

computeStatistics (True/False): turns the likelihood and \(\chi^2\) computation on or off (see likelihood calculation). If True, the likelihood and \(\chi^2\) values are computed for the EM-type results.

testCoverage (True/False): set to True to run the coverage tool.

combineSRs (True/False): set to True to use, whenever available, covariance matrices to combine signal regions. NB this might take a few secs per point. Set to False to use only the most sensitive signal region (faster!). Available v1.1.3 onwards.

particles: defines the particle content of the BSM model

model: pathname to the Python file that defines the particle content of the BSM model, given either in Unix file notation (“/path/to/model.py”) or as a Python module path (“path.to.model”). Defaults to share.models.mssm, which is a standard MSSM. See the smodels/share/models folder for more examples. The directory name can be omitted; in that case, the current working directory as well as smodels/share/models are searched.

parameters: basic parameter values for running SModelS

sigmacut (float): minimum value for an element weight (in fb). Elements with a weight below sigmacut are neglected during the decomposition of SLHA files (see Minimum Decomposition Weight). The default value is 0.03 fb. Note that, depending on the input model, the running time may increase considerably if sigmacut is too low, while too large values might eliminate relevant elements.

minmassgap (float): maximum value of the mass difference (in GeV) for performing mass compression. Only used if doCompress = True.

maxcond (float): maximum allowed value (in the [0,1] interval) for the violation of upper limit conditions. A zero value means the conditions are strictly enforced, while 1 means the conditions are never enforced. Only relevant for printing the output summary.

ncpus (int): number of CPUs. When processing multiple SLHA/LHE files, SModelS can run in a parallelized fashion, splitting up the input files in equal chunks. ncpus = -1 parallelizes to as many processes as number of CPU cores of the machine. Default value is 1. Warning: python already parallelizes many tasks internally.
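
The idea of splitting the input files into chunks, one per process, can be illustrated as follows. This is only a sketch of one possible splitting scheme, not the SModelS implementation. The function name split_into_chunks and the file names are hypothetical.

```python
import os


def split_into_chunks(files, ncpus):
    """Split files into ncpus nearly equal chunks (illustrative, not the SModelS code)."""
    if ncpus == -1:
        # Mirror the documented convention: -1 means one process per CPU core
        ncpus = os.cpu_count() or 1
    ncpus = max(1, min(ncpus, len(files)))  # avoid more chunks than files
    # Round-robin assignment keeps chunk sizes within one file of each other
    return [files[i::ncpus] for i in range(ncpus)]


files = ["point_%02d.slha" % i for i in range(7)]  # hypothetical input files
chunks = split_into_chunks(files, ncpus=3)
for i, chunk in enumerate(chunks):
    print("process %d:" % i, chunk)
```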

database: allows for selection of a subset of experimental results from the database

path: the absolute (or relative) path to the database. The user can supply either the directory name of the database, or the path to the pickle file. Also http addresses may be given, e.g. http://smodels.hephy.at/database/official113. The path “official” refers to the official database of your SModelS version – without fastlim; “official_fastlim” includes fastlim results. See the github database release page for a list of public database versions.

analyses (list of results): set to [‘all’] to use all available results. If a list of experimental analyses is given, only these will be used. For instance, setting analyses = CMS-PAS-SUS-13-008,ATLAS-CONF-2013-024 will only use the experimental results from CMS-PAS-SUS-13-008 and ATLAS-CONF-2013-024. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. For example, analyses = CMS* evaluates results from the CMS experiment only, while *SUS* selects everything containing SUS, whether from CMS or ATLAS. Furthermore, the selection of analyses can be restricted by centre-of-mass energy with a suffix beginning with a colon followed by an energy string in unum style, like :13*TeV. Note that the asterisk after the colon is not a wildcard. :13, :13TeV and :13 TeV are also understood but discouraged.

txnames (list of topologies): set to [‘all’] to use all available simplified model topologies. The topologies are labeled according to the txname convention. If a list of txnames is given, only the corresponding topologies will be considered. For instance, setting txnames = T2 will only consider experimental results for \(pp \to \tilde{q} + \tilde{q} \to (jet+\tilde{\chi}_1^0) + (jet+\tilde{\chi}_1^0)\) and the output will only contain constraints for this topology. A list of all topologies and their corresponding txnames can be found here. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. So, for example, txnames = T[12]*bb* picks all txnames beginning with T1 or T2 and containing bb, which at the time of writing were: T1bbbb, T1bbbt, T1bbqq, T1bbtt, T2bb, T2bbWW, T2bbWWoff.

dataselector (list of datasets): set to [‘all’] to use all available data sets. If dataselector = upperLimit (efficiencyMap), only UL-type results (EM-type results) will be used. Furthermore, if a list of signal regions (data sets) is given, only the experimental results containing these datasets will be used. For instance, if dataselector = SRA mCT150,SRA mCT200, only these signal regions will be used. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. Wildcard examples are given above.

dataTypes: dataType of the analysis (all, efficiencyMap or upperLimit). Can be wildcarded with the usual shell wildcards: * ? [<list-of-or’ed-letters>]. Wildcard examples are given above.
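
The shell-style matching described for analyses, txnames and dataselector behaves like Python's standard fnmatch module. The sketch below uses fnmatch to illustrate which names a pattern selects. It does not call SModelS, and the candidate names are only a sample: those quoted in this document plus two extra txnames (T1tttt, T2tt) added for contrast.

```python
from fnmatch import fnmatchcase


def select(names, pattern):
    """Return the names matching a shell-style wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


txnames = ["T1bbbb", "T1bbbt", "T1bbqq", "T1bbtt", "T1tttt",
           "T2bb", "T2bbWW", "T2bbWWoff", "T2tt"]
analyses = ["CMS-PAS-SUS-13-008", "CMS-SUS-13-006",
            "ATLAS-CONF-2013-024", "ATLAS-SUSY-2015-06"]

print(select(txnames, "T[12]*bb*"))   # drops T1tttt and T2tt
print(select(analyses, "CMS*"))       # CMS results only
print(select(analyses, "*SUS*"))      # anything containing SUS
```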

printer: main options for the output format

outputType (list of outputs): use to list all the output formats to be generated. Available output formats are: summary, stdout, log, python, xml, slha.

stdout-printer: options for the stdout or log printer

printDatabase (True/False): set to True to print the list of selected experimental results to stdout.

addAnaInfo (True/False): set to True to include detailed information about the txnames tested by each experimental result. Only used if printDatabase=True.

printDecomp (True/False): set to True to print basic information from the decomposition (topologies, total weights, …).

addElementInfo (True/False): set to True to include detailed information about the elements generated by the decomposition. Only used if printDecomp=True.

printExtendedResults (True/False): set to True to print extended information about the theory predictions, including the PIDs of the particles contributing to the predicted cross section, their masses and the expected upper limit (if available).

addCoverageID (True/False): set to True to print the list of element IDs contributing to each missing topology (see coverage). Only used if testCoverage = True. This option should be used along with addElementInfo = True so the user can precisely identify which elements were classified as missing.

summary-printer: options for the summary printer

expandedSummary (True/False): set True to include in the summary output all applicable experimental results, False for only the strongest one.

python-printer: options for the Python printer

addElementList (True/False): set True to include in the Python output all information about all elements generated in the decomposition. If set to True the output file can be quite large.

addTxWeights (True/False): set True to print the contribution from individual topologies to each theory prediction. Available v1.1.3 onwards.

xml-printer: options for the xml printer

addElementList (True/False): set True to include in the xml output all information about all elements generated in the decomposition. If set to True the output file can be quite large.

addTxWeights (True/False): set True to print the contribution from individual topologies to each theory prediction. Available v1.1.3 onwards.
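
Putting the entries above together, a small parameters file might look like the text embedded in the sketch below, which is parsed with Python's standard configparser purely to show the section and key layout. The section and option names are those documented above. The values are illustrative choices (only sigmacut = 0.03 is the documented default), and the parameters.ini shipped with SModelS remains the authoritative template.

```python
import configparser

# Illustrative parameters file; values are examples, not recommended settings
EXAMPLE_PARAMETERS = """
[options]
checkInput = True
doInvisible = True
doCompress = True
computeStatistics = False
testCoverage = True
combineSRs = False

[particles]
model = share.models.mssm

[parameters]
sigmacut = 0.03
minmassgap = 5.
maxcond = 0.2
ncpus = 1

[database]
path = official
analyses = all
txnames = all
dataselector = all

[printer]
outputType = summary,python

[summary-printer]
expandedSummary = True
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_PARAMETERS)

# Typed access to a few entries
sigmacut = parser.getfloat("parameters", "sigmacut")        # in fb
do_compress = parser.getboolean("options", "doCompress")
output_types = [t.strip() for t in parser.get("printer", "outputType").split(",")]
print(sigmacut, do_compress, output_types)
```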

The Output

The results of runSModelS.py are printed to the format(s) specified by the outputType in the parameters file. The following formats are available:

a human-readable screen output (stdout) or log output. These are intended to provide detailed information about the database, the decomposition, the theory predictions and the missing topologies. The output complexity can be controlled through several options in the parameters file. Due to its size, this output is not suitable for storing the results from a large scan, being more appropriate for a single file input.

a human-readable text file output containing a summary of the output. This format contains the main SModelS results: the theory predictions and the missing topologies. It can be used for a large scan, since the output can be made quite compact, using the options in the parameters file.

a Python dictionary printed to a file containing information about the decomposition, the theory predictions and the missing topologies. The output can be quite long if all options in the parameters file are set to True. However, this output can be easily imported into a Python environment, making it easy to access the desired information. For users familiar with the Python language this is the recommended format.

an XML file containing information about the decomposition, the theory predictions and the missing topologies. The output can be quite long if all options are set to True. Since XML is widely supported, this output can be easily converted to the user’s preferred format.

a SLHA file containing information about the theory predictions and the missing topologies. The output follows a SLHA-type format and contains a summary of the most constraining results and the missed topologies.

A detailed explanation of the information contained in each type of output is given in SModelS Output.
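
To illustrate why the Python format is convenient, the sketch below loads such a file and picks the result with the largest r-value. It assumes that the printer writes a file defining a single dictionary named smodelsOutput with an ExptRes list whose entries carry AnalysisID and r keys. These names are assumptions and should be checked against the SModelS Output description. The file written here is a hand-made stand-in, with r-values rounded from the Example.py output shown later in this section.

```python
import os
import runpy
import tempfile

# Stand-in for a file produced by the Python printer (structure is assumed)
FAKE_OUTPUT = """smodelsOutput = {
    'ExptRes': [
        {'AnalysisID': 'ATLAS-SUSY-2015-06', 'r': 0.0029},
        {'AnalysisID': 'CMS-SUS-13-006', 'r': 1.2039},
    ],
}
"""

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "simplyGluino.slha.py")
    with open(path, "w") as handle:
        handle.write(FAKE_OUTPUT)
    # Execute the file and pick up the dictionary it defines
    output = runpy.run_path(path)["smodelsOutput"]

strongest = max(output["ExptRes"], key=lambda res: res["r"])
print(strongest["AnalysisID"], strongest["r"])
```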

Example.py

Although runSModelS.py provides the main SModelS features with a command line interface, users more familiar with Python and the SModelS language may prefer to write their own main program. A simple example code for this purpose is provided in examples/Example.py. Below we go step-by-step through this example code:

Import the SModelS modules and methods. If the example code file is not located in the smodels installation folder, simply add “sys.path.append(<smodels installation path>)” before importing smodels. Set SModelS verbosity level.

from smodels import particlesLoader
from smodels.theory import slhaDecomposer,lheDecomposer
from smodels.tools.physicsUnits import fb, GeV, TeV
from smodels.theory.theoryPrediction import theoryPredictionsFor
from smodels.experiment.databaseObj import Database
from smodels.tools import coverage
from smodels.tools.smodelsLogging import setLogLevel
setLogLevel("info")

Set the path to the database URL. Specify which database to use. It can be the path to the smodels-database folder, the path to a pickle file or (starting with v1.1.3) a URL path.

# Set the path to the database
database = Database("official")

Define the input model. By default SModelS assumes the MSSM particle content. To use SModelS with a different particle content, the user must define the new particle content and pass the path of the model file to particlesLoader.load (see particles:model in Parameter File).

    #Define your model (list of rEven and rOdd particles)
    particlesLoader.load( 'smodels.share.models.mssm' ) #Make sure all the model particles are up-to-date

Path to the input file. Specify the location of the input file. It must be a SLHA or LHE file (see Basic Input).

    slhafile = 'inputFiles/slha/lightEWinos.slha'
    lhefile = 'inputFiles/lhe/gluino_squarks.lhe'

Set main options for decomposition. Specify the values of sigmacut and minmassgap:

    sigmacut = 0.01 * fb
    mingap = 5. * GeV

Decompose model. Depending on the type of input format, choose either the slhaDecomposer.decompose or lheDecomposer.decompose method. The doCompress and doInvisible options turn the mass compression and invisible compression on/off.

    # Decompose model (use slhaDecomposer for SLHA input or lheDecomposer for LHE input)
    slhaInput = True
    if slhaInput:
        toplist = slhaDecomposer.decompose(slhafile, sigmacut, doCompress=True, doInvisible=True, minmassgap=mingap)
    else:
        toplist = lheDecomposer.decompose(lhefile, doCompress=True, doInvisible=True, minmassgap=mingap)

Access basic information from decomposition, using the topology list and topology objects:

    # Access basic information from decomposition, using the topology list and topology objects:
    print( "\n Decomposition Results: " )
    print( "\t  Total number of topologies: %i " %len(toplist) )
    nel = sum([len(top.elementList) for top in toplist])
    print( "\t  Total number of elements = %i " %nel )
    #Print information about the m-th topology (if it exists):
    m = 2
    if len(toplist) > m:
        top = toplist[m]
        print( "\t\t %i-th topology  = " %m,top,"with total cross section =",top.getTotalWeight() )
        #Print information about the n-th element in the m-th topology:
        n = 0
        el = top.elementList[n]
        print( "\t\t %i-th element from %i-th topology  = " %(n,m),el, end="" )
        print( "\n\t\t\twith final states =",el.getFinalStates(),"\n\t\t\twith cross section =",el.weight,"\n\t\t\tand masses = ",el.getMasses() )

output:

 Decomposition Results: 
	  Total number of topologies: 51 
	  Total number of elements = 14985 
		 2-th topology  =  [][2] with total cross section = ['8.00E+00 [TeV]:3.05E-01 [pb]', '1.30E+01 [TeV]:5.21E-01 [pb]']
		 0-th element from 2-th topology  =  [[],[[b,b]]]
			with final states = ['MET', 'MET'] 
			with cross section = ['8.00E+00 [TeV]:2.44E-04 [pb]', '1.30E+01 [TeV]:1.17E-03 [pb]'] 
			and masses =  [[6.81E+01 [GeV]], [1.35E+02 [GeV], 6.81E+01 [GeV]]]

Load the experimental results to be used to constrain the input model. Here, all results are used:

    listOfExpRes = database.getExpResults()

Alternatively, the getExpResults method can take as arguments specific results to be loaded.

Print basic information about the results loaded. Below we show how to count the number of UL-type results and EM-type results loaded:

    nUL, nEM = 0, 0
    for exp in listOfExpRes:
        expType = exp.getValuesFor('dataType')[0]
        if expType == 'upperLimit':
            nUL += 1
        elif  expType == 'efficiencyMap':
            nEM += 1
    print( "\n Loaded Database with %i UL results and %i EM results " %(nUL,nEM) )

output:

 Loaded Database with 55 UL results and 21 EM results 

Compute the theory predictions for each experimental result. The output is a list of theory prediction objects (for each experimental result):

    rmax = 0.
    bestResult = None
    for expResult in listOfExpRes:
        predictions = theoryPredictionsFor(expResult, toplist, combinedResults=False, marginalize=False)

Print the results. For each experimental result, loop over the corresponding theory predictions and print the relevant information:

        for theoryPrediction in predictions:
            dataset = theoryPrediction.dataset
            datasetID = dataset.dataInfo.dataId            
            mass = theoryPrediction.mass
            txnames = [str(txname) for txname in theoryPrediction.txnames]
            PIDs =  theoryPrediction.PIDs         
            print( "------------------------" )
            print( "Dataset = ",datasetID )   #Analysis name
            print( "TxNames = ",txnames )  
            print( "Prediction Mass = ",mass )   #Value for average cluster mass (average mass of the elements in cluster)
            print( "Prediction PIDs = ",PIDs )   #PIDs of the particles contributing to the prediction
            print( "Theory Prediction = ",theoryPrediction.xsection )  #Signal cross section
            print( "Condition Violation = ",theoryPrediction.conditions ) #Condition violation values

output:

 ATLAS-SUSY-2015-06 
------------------------
Dataset =  SR5j
TxNames =  ['T1']
Prediction Mass =  [[5.77E+02 [GeV], 6.81E+01 [GeV]], [5.77E+02 [GeV], 6.81E+01 [GeV]]]
Prediction PIDs =  [[[1000021, 1000022], [1000021, 1000022]]]
Theory Prediction =  1.30E+01 [TeV]:5.19E-06 [pb]
Condition Violation =  {'None': None}

Get the corresponding upper limit. This value can be compared to the theory prediction to decide whether a model is excluded or not:

            print( "UL for theory prediction = ",theoryPrediction.upperLimit )

output:

UL for theory prediction =  1.79E+00 [fb]

Print the r-value, i.e. the ratio theory prediction/upper limit. A value of \(r \geq 1\) means that an experimental result excludes the input model. For EM-type results also compute the \(\chi^2\) and likelihood. Determine the most constraining result:

            #Compute the r-value (theory prediction/upper limit):
            r = theoryPrediction.getRValue()
            print( "r = ",r )
            #Compute likelihood and chi^2 for EM-type results:
            if dataset.dataInfo.dataType == 'efficiencyMap':
                theoryPrediction.computeStatistics()
                print( 'Chi2, likelihood=', theoryPrediction.chi2, theoryPrediction.likelihood )
            if r > rmax:
                rmax = r
                bestResult = expResult.globalInfo.id
            

output:

r =  0.0029013935307768326
Chi2, likelihood= 2.3776849368423356 0.007169156710956845
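
As a cross-check of the numbers printed above, the r-value can be reproduced by hand from the theory prediction (5.19E-06 pb) and the upper limit (1.79 fb) after converting to a common unit. This sketch uses plain floats rather than the SModelS unit objects. The small difference from the printed r comes from the rounding of the displayed values.

```python
PB_TO_FB = 1000.0  # 1 pb = 1000 fb

theory_prediction_pb = 5.19e-06  # printed theory prediction (rounded)
upper_limit_fb = 1.79            # printed upper limit (rounded)

# r = theory prediction / upper limit, both expressed in fb
r = theory_prediction_pb * PB_TO_FB / upper_limit_fb
excluded = r >= 1.0
print("r = %.4f, excluded: %s" % (r, excluded))
```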

Print the most constraining experimental result. Using the largest r-value, determine if the model has been excluded or not by the selected experimental results:

    print( "\nThe largest r-value (theory/upper limit ratio) is ",rmax )
    if rmax > 1.:
        print( "(The input model is likely excluded by %s)" %bestResult )
    else:
        print( "(The input model is not excluded by the simplified model results)" )
      

output:

The largest r-value (theory/upper limit ratio) is  1.2039296443268397
(The input model is likely excluded by CMS-SUS-13-006)

Identify missing topologies. Using the output from decomposition, identify the missing topologies and print some basic information:

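    #Find the missing topologies (uncovered cross sections):
    uncovered = coverage.Uncovered(toplist)
    print( "\nTotal missing topology cross section (fb): %10.3E\n" %(uncovered.getMissingXsec()) )
    print( "Total cross section where we are outside the mass grid (fb): %10.3E\n" %(uncovered.getOutOfGridXsec()) )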
    print( "Total cross section in long cascade decays (fb): %10.3E\n" %(uncovered.getLongCascadeXsec()) )
    print( "Total cross section in decays with asymmetric branches (fb): %10.3E\n" %(uncovered.getAsymmetricXsec()) )
    
    #Print some of the missing topologies:
    print( 'Missing topologies (up to 3):' )
    for topo in uncovered.missingTopos.topos[:3]:
        print( 'Topology:',topo.topo )
        print( 'Contributing elements (up to 2):' )
        for el in topo.contributingElements[:2]:
            print( el,'cross-section (fb):', el.missingX )
    
    #Print elements outside the grid:
    print( '\nElements outside the grid (up to 2):' )
    for topo in uncovered.outsideGrid.topos[:2]:
        print( 'Topology:',topo.topo )
        print( 'Contributing elements (up to 4):' )
        for el in topo.contributingElements[:4]:
            print( el,'cross-section (fb):', el.missingX )
            print( '\tmass:',el.getMasses() )
        

output:

Total missing topology cross section (fb):  3.717E+03

Total cross section where we are outside the mass grid (fb):  9.984E+01

Total cross section in long cascade decays (fb):  8.992E+02

Total cross section in decays with asymmetric branches (fb):  2.790E+03

Missing topologies (up to 3):
Topology: [[],[]](MET,MET)
Contributing elements (up to 2):
[[],[]] cross-section (fb): 0.45069449699999997

Elements outside the grid (up to 2):
Topology: [[[W]],[[Z]]](MET,MET)
Contributing elements (up to 4):
[[[W+]],[[Z]]] cross-section (fb): 0.3637955190752928
	mass: [[2.93E+02 [GeV], 6.81E+01 [GeV]], [2.66E+02 [GeV], 6.81E+01 [GeV]]]

It is worth noting that SModelS does not include any statistical treatment of the results, such as correction factors for the “look elsewhere effect”. For this reason, models are reported as “likely excluded” in the output.

Notes:

For an SLHA input file, the decays of final states (or Z₂-even particles such as the Higgs, W,…) are always ignored during the decomposition. Furthermore, if there are two cross sections at different calculation order (say LO and NLO) for the same process, only the highest order is used.
The list of elements can be extremely long. Try setting addElementInfo = False and/or printDecomp = False to obtain a smaller output.
A comment of caution is in order regarding naively using the highest \(r\)-value reported by SModelS, as this does not necessarily come from the most sensitive analysis. For a rigorous statistical interpretation, one should use the \(r\)-value of the result with the highest expected \(r\) (\(r_{exp}\)). Unfortunately, for UL-type results, the expected limits are often not available; \(r_{exp}\) is then reported as N/A in the SModelS output.
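
The recommendation in the last note can be sketched as follows: rank the results by their expected r-value and quote the observed r of the top-ranked one. The result labels and numbers below are invented for illustration, and a missing expected value (N/A in the SModelS output) is represented by None. How to treat results without \(r_{exp}\) is left to the user; here they are simply left out of the ranking.

```python
# (analysis label, observed r, expected r or None if not available)
results = [
    ("analysis-A", 1.20, 0.70),   # strong observed r, weaker expected sensitivity
    ("analysis-B", 0.85, 1.10),   # most sensitive according to r_exp
    ("analysis-C", 1.50, None),   # UL-type result without expected limit
]

# Rank only the results for which an expected r-value exists
with_expected = [res for res in results if res[2] is not None]
most_sensitive = max(with_expected, key=lambda res: res[2])
# Naive choice for comparison: largest observed r-value
naive_choice = max(results, key=lambda res: res[1])

print("highest observed r:", naive_choice[0], naive_choice[1])
print("most sensitive (highest r_exp):", most_sensitive[0],
      "with observed r =", most_sensitive[1])
```

In this invented example the naive choice would declare the point excluded (r = 1.50), whereas the most sensitive result has an observed r below 1.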

[1]	We note that SLHA files including decay tables and cross sections, together with the corresponding model.py, can conveniently be generated via the SModelS-micrOMEGAS interface, see arXiv:1606.03834