experiment package¶
Submodules¶
experiment.databaseObj module¶
-
class
experiment.databaseObj.
Database
(base=None, force_load=None, discard_zeroes=True, progressbar=False, subpickle=True, combinationsmatrix=None)[source]¶ Bases:
object
Database object. Holds a list of SubDatabases. Delegates all calls to SubDatabases.
Parameters: - base – path to the database, or pickle file (string), or http address. If None, “official”, or “official_fastlim”, use the official database for your code version (including fastlim results, if specified). If “latest”, or “latest_fastlim”, check for the latest database. Multiple databases may be specified using “+” as a delimiter.
- force_load – force loading the text database (“txt”) or the binary database (“pcl”); if None, nothing is forced
- discard_zeroes – discard txnames with only zeroes as entries.
- progressbar – show a progressbar when building pickle file (needs the python-progressbar module)
- subpickle – produce small pickle files per exp result. Should only be used when working on the database.
- combinationsmatrix – an optional dictionary that contains info about combinable analyses, e.g. { “anaid1”: ( “anaid2”, “anaid3” ) } optionally specifying signal regions, e.g. { “anaid1:SR1”: ( “anaid2:SR2”, “anaid3” ) }
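The combinationsmatrix format can be illustrated with a small, self-contained sketch. The helper `is_marked_combinable` is hypothetical and not part of the package; it shows one plausible reading of such a dictionary, assuming that an entry without a signal region (e.g. “anaid3”) applies to every signal region of that analysis and that the relation is symmetric. The analysis IDs are placeholders.

```python
def is_marked_combinable(matrix, ana_a, sr_a, ana_b, sr_b):
    """Hypothetical lookup in a combinations-matrix-style dictionary."""

    def labels(ana, sr):
        # An entry may name the whole analysis or one signal region of it.
        return {ana, f"{ana}:{sr}"}

    def one_way(src, dst):
        return any(
            key in src and any(partner in dst for partner in partners)
            for key, partners in matrix.items()
        )

    a, b = labels(ana_a, sr_a), labels(ana_b, sr_b)
    # Assumption: the relation is symmetric, so both directions are checked.
    return one_way(a, b) or one_way(b, a)


combinationsmatrix = {
    "anaid1": ("anaid2", "anaid3"),
    "anaid4:SR1": ("anaid5:SR2", "anaid6"),
}

print(is_marked_combinable(combinationsmatrix, "anaid4", "SR1", "anaid5", "SR2"))
```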
-
createLinksToCombinationsMatrix
()[source]¶ in all globalInfo objects, create a shallow link to the combinations matrix
-
databaseParticles
¶ Database particles, a list, one entry per sub
-
databaseVersion
¶ The version of the database, concatenation of the individual versions
-
expResultList
¶ The combined list, compiled from the individual lists
-
getExpResults
(analysisIDs=['all'], datasetIDs=['all'], txnames=['all'], dataTypes=['all'], useSuperseded=False, useNonValidated=False, onlyWithExpected=False)[source]¶ Returns a list of ExpResult objects.
Each object refers to an analysisID containing one (for UL) or more (for efficiency maps) datasets (signal regions), with each dataset containing one or more TxNames. If analysisIDs is defined, returns only the results matching one of the IDs in the list. If dataTypes is defined, returns only the results matching a dataType in the list. If datasetIDs is defined, returns only the results matching one of the IDs in the list. If txnames is defined, returns only the results matching one of the Tx names in the list.
Parameters: - analysisIDs – list of analysis ids ([CMS-SUS-13-006,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>] Furthermore, the centre-of-mass energy can be chosen as suffix, e.g. “:13*TeV”. Note that the asterisk in the suffix is not a wildcard.
- datasetIDs – list of dataset ids ([ANA-CUT0,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
- txnames – list of txnames ([TChiWZ,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
- dataTypes – dataType of the analysis (all, efficiencyMap or upperLimit) Can be wildcarded with usual shell wildcards: * ? [<letters>]
- useSuperseded – If False, the supersededBy results will not be included (deprecated)
- useNonValidated – If False, the results with validated = False will not be included
- onlyWithExpected – Return only those results that have expected values also. Note that this is trivially fulfilled for all efficiency maps.
Returns: list of ExpResult objects or the ExpResult object if the list contains only one result
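The wildcard selection described for analysisIDs can be sketched with the standard `fnmatch` module. This is only an illustration under stated assumptions, not the package's own matching code: the helper `matches_analysis` is hypothetical, the energy suffix is compared as a literal string, and the ID/energy pairs in the list are placeholders.

```python
from fnmatch import fnmatchcase


def matches_analysis(pattern, analysis_id, sqrts_label):
    """Hypothetical matcher: shell wildcards on the ID, literal energy suffix."""
    id_pattern, _, energy = pattern.partition(":")
    if energy and energy != sqrts_label:
        # The "*" in e.g. "13*TeV" is compared literally, not expanded.
        return False
    return id_pattern == "all" or fnmatchcase(analysis_id, id_pattern)


# Placeholder (ID, centre-of-mass energy) pairs, for illustration only.
results = [
    ("CMS-SUS-13-006", "8*TeV"),
    ("CMS-SUS-00-001", "13*TeV"),
    ("ATLAS-SUSY-0000-01", "13*TeV"),
]
selected = [
    rid for rid, sqrts in results
    if matches_analysis("CMS-SUS-*:13*TeV", rid, sqrts)
]
print(selected)
```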
-
pcl_meta
¶ The meta info of the binary (pickle) version, a merger of the original ones
-
txt_meta
¶ The meta info of the text version, a merger of the original ones
-
class
experiment.databaseObj.
ExpResultList
(expResList)[source]¶ Bases:
object
Holds a list of ExpResult objects for printout.
Parameters: expResList – list of ExpResult objects
-
class
experiment.databaseObj.
SubDatabase
(base=None, force_load=None, discard_zeroes=True, progressbar=False, subpickle=True, combinationsmatrix=None)[source]¶ Bases:
object
SubDatabase object. Holds a list of ExpResult objects.
Parameters: - base – path to the database, or pickle file (string), or http address. If None, “official”, or “official_fastlim”, use the official database for your code version (including fastlim results, if specified). If “latest”, or “latest_fastlim”, check for the latest database. Multiple databases may be named, using “+” as a delimiter. Order matters: results with the same name will be overwritten according to the sequence.
- force_load – force loading the text database (“txt”) or the binary database (“pcl”); if None, nothing is forced
- discard_zeroes – discard txnames with only zeroes as entries.
- progressbar – show a progressbar when building pickle file (needs the python-progressbar module)
- subpickle – produce small pickle files per exp result. Should only be used when working on the database.
- combinationsmatrix – an optional dictionary that contains info about combinable analyses, e.g. { “anaid1”: ( “anaid2”, “anaid3” ) } optionally specifying signal regions, e.g. { “anaid1:SR1”: ( “anaid2:SR2”, “anaid3” ) }
-
base
¶ This is the path to the base directory.
-
checkPathName
(path, discard_zeroes)[source]¶ Checks the path name. If path starts with http or ftp, fetches the description file and the database. Returns the base directory and the pickle file name.
-
createBinaryFile
(filename=None)[source]¶ create a pcl file from the text database, potentially overwriting an old pcl file.
-
createLinksToCombinationsMatrix
()[source]¶ in all globalInfo objects, create links to self.combinationsmatrix
-
databaseVersion
¶ The version of the database, read from the ‘version’ file.
-
fetchFromScratch
(path, store, discard_zeroes)[source]¶ Fetch the database from scratch, together with its description.
Parameters: store – filename to store json file.
-
getExpResults
(analysisIDs=['all'], datasetIDs=['all'], txnames=['all'], dataTypes=['all'], useSuperseded=False, useNonValidated=False, onlyWithExpected=False)[source]¶ Returns a list of ExpResult objects.
Each object refers to an analysisID containing one (for UL) or more (for efficiency maps) datasets (signal regions), with each dataset containing one or more TxNames. If analysisIDs is defined, returns only the results matching one of the IDs in the list. If dataTypes is defined, returns only the results matching a dataType in the list. If datasetIDs is defined, returns only the results matching one of the IDs in the list. If txnames is defined, returns only the results matching one of the Tx names in the list.
Parameters: - analysisIDs – list of analysis ids ([CMS-SUS-13-006,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>] Furthermore, the centre-of-mass energy can be chosen as suffix, e.g. “:13*TeV”. Note that the asterisk in the suffix is not a wildcard.
- datasetIDs – list of dataset ids ([ANA-CUT0,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
- txnames – list of txnames ([TChiWZ,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
- dataTypes – dataType of the analysis (all, efficiencyMap or upperLimit) Can be wildcarded with usual shell wildcards: * ? [<letters>]
- useSuperseded – If False, the supersededBy results will not be included (deprecated)
- useNonValidated – If False, the results with validated = False will not be included
- onlyWithExpected – Return only those results that have expected values also. Note that this is trivially fulfilled for all efficiency maps.
Returns: list of ExpResult objects or the ExpResult object if the list contains only one result
-
inNotebook
()[source]¶ Are we running within a notebook? Has an effect on the progressbar we wish to use.
-
loadBinaryFile
(lastm_only=False)[source]¶ Load a binary database, returning last modified, file count, database.
Parameters: lastm_only – if true, the database itself is not read. Returns: database object, or None, if lastm_only == True.
-
loadDatabase
()[source]¶ If no binary file is available, load the text database and create the binary file. If a binary file is available, check whether it needs an update and, if so, create a new binary file.
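The load-or-rebuild behaviour can be sketched as a generic pickle cache. This is a minimal illustration, not the package's implementation: the function `load_with_cache` and its `parse` callback are hypothetical, and staleness is judged here only by comparing file modification times.

```python
import os
import pickle


def load_with_cache(text_path, pcl_path, parse):
    """Hypothetical cache logic: reuse the pickle unless the text source is newer."""
    fresh = (
        os.path.exists(pcl_path)
        and os.path.getmtime(pcl_path) >= os.path.getmtime(text_path)
    )
    if fresh:
        with open(pcl_path, "rb") as handle:
            return pickle.load(handle)
    # No usable binary file: parse the text source and (re)write the pickle.
    with open(text_path) as handle:
        data = parse(handle.read())
    with open(pcl_path, "wb") as handle:
        pickle.dump(data, handle)
    return data
```

A caller would pass the path of a text source, the path of the pickle to maintain, and a function that turns the text into the object to be cached.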
experiment.datasetObj module¶
-
class
experiment.datasetObj.
CombinedDataSet
(expResult)[source]¶ Bases:
object
Holds the information for a combined dataset (used for combining multiple datasets).
-
getDataSet
(datasetID)[source]¶ Returns the dataset with the corresponding dataset ID. If the dataset is not found, returns None.
Parameters: datasetID – dataset ID (string) Returns: DataSet object if found, otherwise None.
-
-
class
experiment.datasetObj.
DataSet
(path=None, info=None, createInfo=True, discard_zeroes=True, databaseParticles=None)[source]¶ Bases:
object
Holds the information to a data set folder (TxName objects, dataInfo,…)
Parameters: discard_zeroes – discard txnames with zero-only results -
checkForRedundancy
(databaseParticles)[source]¶ In case of efficiency maps, check if any txnames have overlapping constraints. This would result in double counting, so it is not allowed.
-
chi2
(nsig, deltas_rel=0.2, marginalize=False)[source]¶ Computes the chi2 for a given number of observed events “nobs”, given number of signal events “nsig”, and relative error on signal “deltas_rel”. nobs, expectedBG and bgError are part of dataInfo.
Parameters: - nsig – predicted signal (float)
- deltas_rel – relative uncertainty in signal (float). Default value is 20%.
- marginalize – if true, marginalize nuisances. Else, profile them.
Returns: chi2 (float)
-
getAttributes
(showPrivate=False)[source]¶ Checks for all the fields/attributes it contains as well as the attributes of its objects if they belong to smodels.experiment.
Parameters: showPrivate – if True, also returns the protected fields (_field) Returns: list of field names (strings)
-
getEfficiencyFor
(txname, mass)[source]¶ Convenience function. Get efficiency for mass assuming no lifetime rescaling. Same as self.getTxName(txname).getEfficiencyFor(mass)
-
getLumi
()[source]¶ Return the dataset luminosity. If not defined for the dataset, use the value defined in globalInfo.lumi.
-
getSRUpperLimit
(alpha=0.05, expected=False, compute=False, deltas_rel=0.2)[source]¶ Computes the 95% upper limit on the signal*efficiency for a given dataset (signal region). Only to be used for efficiency map type results.
Parameters: - alpha – Can be used to change the C.L. value. The default value is 0.05 (= 95% C.L.)
- expected – Compute expected limit ( i.e. Nobserved = NexpectedBG )
- deltas_rel – relative uncertainty in signal (float). Default value is 20%.
- compute – If True, the upper limit will be computed from expected and observed number of events. If False, the value listed in the database will be used instead.
Returns: upper limit value
-
getUpperLimitFor
(element=None, expected=False, txnames=None, compute=False, alpha=0.05, deltas_rel=0.2)[source]¶ Returns the upper limit for a given element (or mass) and txname. If the dataset holds an EM map result, the upper limit is independent of the input txname or mass. For UL results, if an Element object is given, the corresponding upper limit will be rescaled according to the lifetimes of the element's intermediate particles. On the other hand, if a mass is given, no rescaling will be applied.
Parameters: - txnames – TxName object or txname string (only for UL-type results)
- element – Element object or mass array with units (only for UL-type results)
- alpha – Can be used to change the C.L. value. The default value is 0.05 (= 95% C.L.) (only for efficiency-map results)
- deltas_rel – relative uncertainty in signal (float). Default value is 20%.
- expected – Compute expected limit, i.e. Nobserved = NexpectedBG (only for efficiency-map results)
- compute – If True, the upper limit will be computed from expected and observed number of events. If False, the value listed in the database will be used instead.
Returns: upper limit (Unum object)
-
getValuesFor
(attribute)[source]¶ Returns a list of the possible values appearing in the ExpResult for the required attribute (sqrts,id,constraint,…). If there is a single value, returns the value itself.
Parameters: attribute – name of a field in the database (string). Returns: list of unique values for the attribute
-
isCombinableWith
(other)[source]¶ Function that reports if two datasets are mutually uncorrelated = combinable.
Parameters: other – datasetObj to compare self with
-
isGlobalFieldCombinableWith_
(other)[source]¶ Checks for ‘combinableWith’ fields in globalInfo and whether <other> matches. This check is at analysis level (not at dataset level).
Parameters: other – a dataset to check against Returns: True if the pair is marked as combinable, else False
-
isLocalFieldCombinableWith_
(other)[source]¶ Checks for ‘combinableWith’ fields in globalInfo and whether <other> matches. This check is at dataset level (not at analysis level).
Parameters: other – a dataset to check against Returns: True if the pair is marked as combinable, else False
-
likelihood
(nsig, deltas_rel=0.2, marginalize=False, expected=False)[source]¶ Computes the likelihood to observe nobs events, given a predicted signal “nsig”, assuming “deltas_rel” error on the signal efficiency. The values observedN, expectedBG, and bgError are part of dataInfo.
Parameters: - nsig – predicted signal (float)
- deltas_rel – relative uncertainty in signal (float). Default value is 20%.
- marginalize – if true, marginalize nuisances. Else, profile them.
- expected – Compute expected instead of observed likelihood
Returns: likelihood to observe nobs events (float)
-
lmax
(deltas_rel=0.2, marginalize=False, expected=False, allowNegativeSignals=False)[source]¶ Convenience function, computes the likelihood at nsig = observedN - expectedBG, assuming “deltas_rel” error on the signal efficiency. The values observedN, expectedBG, and bgError are part of dataInfo.
Parameters: - deltas_rel – relative uncertainty in signal (float). Default value is 20%.
- marginalize – if true, marginalize nuisances. Else, profile them.
- expected – Compute expected instead of observed likelihood
- allowNegativeSignals – if False, then negative nsigs are replaced with 0.
Returns: likelihood to observe nobs events (float)
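To make the relation between likelihood and lmax concrete, here is a deliberately simplified sketch. It assumes a plain Poisson counting model and ignores the nuisance parameters entirely (no bgError, no deltas_rel, no marginalization or profiling), so it is not the computation performed by the package. The function names are hypothetical.

```python
import math


def poisson_likelihood(nobs, expected_bg, nsig):
    """Poisson probability of nobs events for mean expected_bg + nsig (no nuisances)."""
    mean = expected_bg + nsig
    return math.exp(nobs * math.log(mean) - mean - math.lgamma(nobs + 1))


def lmax_simplified(nobs, expected_bg, allow_negative_signals=False):
    """Likelihood at nsig = nobs - expected_bg, with negative nsig replaced by 0
    unless negative signals are allowed."""
    nsig = nobs - expected_bg
    if nsig < 0 and not allow_negative_signals:
        nsig = 0.0
    return poisson_likelihood(nobs, expected_bg, nsig)


# Illustrative numbers only: 5 observed events on an expected background of 3.
print(lmax_simplified(5, 3.0))
```

In this simplified model the value at nsig = nobs - expectedBG is the largest likelihood any signal hypothesis can reach, which is why clipping a negative nsig to 0 can only lower it.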
-
experiment.defaultFinalStates module¶
experiment.exceptions module¶
experiment.expResultObj module¶
-
class
experiment.expResultObj.
ExpResult
(path=None, discard_zeroes=True, databaseParticles=None)[source]¶ Bases:
object
Object containing the information and data corresponding to an experimental result (experimental conference note or publication).
Parameters: - path – Path to the experimental result folder
- discard_zeroes – Discard maps with only zeroes
- databaseParticles – the model, i.e. the particle content
-
getAttributes
(showPrivate=False)[source]¶ Checks for all the fields/attributes it contains as well as the attributes of its objects if they belong to smodels.experiment.
Parameters: showPrivate – if True, also returns the protected fields (_field) Returns: list of field names (strings)
-
getEfficiencyFor
(txname, mass, dataset=None)[source]¶ Convenience function. Get the efficiency for a specific dataset and a specific txname. Equivalent to: self.getDataset(dataset).getEfficiencyFor(txname, mass)
-
getTxnameWith
(restrDict={})[source]¶ Returns a list of TxName objects satisfying the restrictions. The restrictions are specified as a dictionary.
Parameters: restrDict – dictionary containing the fields and their allowed values. E.g. {‘txname’ : ‘T1’, ‘axes’ : ….} The dictionary values can be single entries or a list of values. For the fields not listed, all values are assumed to be allowed. Returns: list of TxName objects if more than one txname matches the selection criteria or a single TxName object, if only one matches the selection.
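The restriction-dictionary convention, including the single-object return when exactly one entry matches, can be sketched as follows. The function `select_with` and the stand-in objects are hypothetical; the attribute values are placeholders rather than real database content.

```python
from types import SimpleNamespace


def select_with(objects, restrictions=None):
    """Hypothetical filter following the documented restrDict convention."""
    restrictions = restrictions or {}
    selected = []
    for obj in objects:
        keep = True
        for field, allowed in restrictions.items():
            # A restriction may be a single entry or a list of allowed values.
            allowed_values = allowed if isinstance(allowed, list) else [allowed]
            if getattr(obj, field, None) not in allowed_values:
                keep = False
                break
        if keep:
            selected.append(obj)
    # Documented convention: a single match is returned as the object itself.
    return selected[0] if len(selected) == 1 else selected


# Stand-in objects with placeholder attribute values.
txnames = [
    SimpleNamespace(txname="T1", axes="a"),
    SimpleNamespace(txname="T2", axes="a"),
    SimpleNamespace(txname="T1", axes="b"),
]
print(select_with(txnames, {"txname": "T2"}))
```

Because the return type depends on the number of matches, calling code that always wants a list has to check for the single-object case.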
-
getUpperLimitFor
(dataID=None, alpha=0.05, expected=False, txname=None, mass=None, compute=False)[source]¶ Computes the 95% upper limit (UL) on the signal cross section according to the type of result. For an Efficiency Map type, returns the UL for the signal*efficiency for the given dataSet ID (signal region). For an Upper Limit type, returns the UL for the signal*BR for the given mass array and Txname.
Parameters: - dataID – dataset ID (string) (only for efficiency-map type results)
- alpha – Can be used to change the C.L. value. The default value is 0.05 (= 95% C.L.) (only for efficiency-map results)
- expected – Compute expected limit, i.e. Nobserved = NexpectedBG (only for efficiency-map results)
- txname – TxName object or txname string (only for UL-type results)
- mass – Mass array with units (only for UL-type results)
- compute – If True, the upper limit will be computed from expected and observed number of events. If False, the value listed in the database will be used instead.
Returns: upper limit (Unum object)
-
getValuesFor
(attribute)[source]¶ Returns a list of the possible values appearing in the ExpResult for the required attribute (sqrts,id,constraint,…). If there is a single value, returns the value itself.
Parameters: attribute – name of a field in the database (string). Returns: list of unique values for the attribute
experiment.infoObj module¶
-
class
experiment.infoObj.
Info
(path=None)[source]¶ Bases:
object
Holds the meta data information contained in a .txt file (luminosity, sqrts, experimentID,…). Its attributes are generated according to the lines in the .txt file which contain “info_tag: value”.
Parameters: path – path to the .txt file -
addInfo
(tag, value)[source]¶ Adds the info field labeled by tag with value value to the object.
Parameters: - tag – information label (string)
- value – value for the field in string format
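How “info_tag: value” lines could become attributes is sketched below. The class `InfoSketch` is a hypothetical stand-in that reads from a string instead of a file, and the tag values are placeholders; comment handling and any type conversion are left out.

```python
class InfoSketch:
    """Hypothetical stand-in that turns "info_tag: value" lines into attributes."""

    def __init__(self, text):
        for line in text.splitlines():
            # Split at the first colon only, so values may contain colons.
            tag, sep, value = line.partition(":")
            if sep and tag.strip():
                self.add_info(tag.strip(), value.strip())

    def add_info(self, tag, value):
        # Values are stored in string format.
        setattr(self, tag, value)


info = InfoSketch("id: ANA-EXAMPLE-01\nsqrts: 8.0*TeV\nlumi: 19.5/fb\n")
print(info.id, info.sqrts, info.lumi)
```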
-
experiment.metaObj module¶
-
class
experiment.metaObj.
Meta
(pathname, discard_zeroes=None, mtime=None, filecount=None, hasFastLim=None, databaseVersion=None, format_version=214, python='3.6.12 (default, Oct 19 2020, 15:18:45) \n[GCC 7.5.0]')[source]¶ Bases:
object
Parameters: - pathname – filename of pickle file, or dirname of text files
- discard_zeroes – do we discard zeroes?
- mtime – last modification time stamps
- filecount – number of files
- hasFastLim – fastlim in the database?
- databaseVersion – version of database
- format_version – format version of pickle file
- python – python version
-
current_version
= 214¶ The Meta object holds all meta information regarding the database, like number of analyses, last time of modification, … This info is needed to understand if we have to re-pickle.
-
determineLastModified
(force=False)[source]¶ Compute the last modified timestamp and count the number of files. Only applies to a text database.
-
lastModifiedSubDir
(subdir)[source]¶ Return the last modified timestamp of subdir (working recursively) plus the number of files.
Parameters: - subdir – directory name that is checked
- lastm – the most recent timestamp so far, plus number of files
Returns: the most recent timestamp, and the number of files
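A recursive scan of this kind can be sketched with `os.walk`. The function name is hypothetical and the sketch does not claim to reproduce the method's exact traversal; it only shows how the most recent timestamp and the file count can be gathered in one pass.

```python
import os


def last_modified_and_count(directory):
    """Return (latest modification time, number of files) for a directory tree."""
    latest, count = 0.0, 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            latest = max(latest, os.path.getmtime(os.path.join(root, name)))
            count += 1
    return latest, count
```

A cache could then be treated as stale when either value differs from the pair stored alongside it, although that comparison rule is an assumption here.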
-
needsUpdate
(current)[source]¶ Checks whether an update is needed with respect to <current>, where <current> is the text database and <self> is the pcl.
experiment.txnameObj module¶
-
class
experiment.txnameObj.
Delaunay1D
(data)[source]¶ Bases:
object
Uses a 1D data array to interpolate the data. The attribute simplices is a list of N-1 pairs of ints with the indices of the points forming the simplices (e.g. [[0,1],[1,2],[2,3],…]).
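The simplex structure for 1D data can be sketched in a few lines. Both functions are hypothetical illustrations, assuming the points are sorted and distinct; the linear interpolation shows how such index pairs can be used and is not taken from the class itself.

```python
def simplices_1d(points):
    """Index pairs of neighbouring points, assuming points are sorted."""
    return [[i, i + 1] for i in range(len(points) - 1)]


def interpolate_1d(points, values, x):
    """Linear interpolation inside the simplex that contains x (None outside)."""
    for left, right in simplices_1d(points):
        x0, x1 = points[left], points[right]
        if x0 <= x <= x1:
            weight = (x - x0) / (x1 - x0)
            return (1 - weight) * values[left] + weight * values[right]
    return None


print(simplices_1d([1.0, 2.0, 4.0]))
print(interpolate_1d([1.0, 2.0, 4.0], [10.0, 20.0, 40.0], 3.0))
```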
-
checkData
(data)[source]¶ Define the simplices according to data. Compute and store the transformation matrix and simplices self.point.
-
find_index
(xlist, x)[source]¶ Efficient way to find x in a list. Returns the index (i) of xlist such that xlist[i] < x <= xlist[i+1]. If x > max(xlist), returns the length of the list. If x < min(xlist), returns 0.
Parameters: - xlist – List of x-type objects
- x – object to be searched for.
Returns: Index of the list such that xlist[i] < x <= xlist[i+1].
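The documented contract can be reproduced with a binary search from the standard `bisect` module. This sketch assumes a non-empty list sorted in ascending order; the function name is hypothetical, and the behaviour for x equal to the first element (returning 0) is an assumption, since the description does not cover that case.

```python
from bisect import bisect_left


def find_index_sketch(xlist, x):
    """Follow the documented contract for a sorted (ascending) list."""
    if x < xlist[0]:
        return 0
    if x > xlist[-1]:
        return len(xlist)
    # bisect_left gives the first position p with xlist[p] >= x,
    # so p - 1 is the index i with xlist[i] < x <= xlist[i+1].
    # The max() only matters when x equals the first element.
    return max(bisect_left(xlist, x) - 1, 0)


grid = [1.0, 2.0, 3.0, 4.0]
print([find_index_sketch(grid, x) for x in (0.5, 2.5, 3.0, 5.0)])
```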
-
-
class
experiment.txnameObj.
TxName
(path, globalObj, infoObj, databaseParticles)[source]¶ Bases:
object
Holds the information related to one txname in the Txname.txt file (constraint, condition,…) as well as the data.
-
addInfo
(tag, value)[source]¶ Adds the info field labeled by tag with value value to the object.
Parameters: - tag – information label (string)
- value – value for the field in string format
-
fetchAttribute
(attr, fillvalue=None)[source]¶ Auxiliary method to get the attribute from self. If not found, look for it in datasetInfo and if still not found look for it in globalInfo. If not found in either of the above, return fillvalue.
Parameters: - attr – Name of attribute (string)
- fillvalue – Value to be returned if attribute is not found.
Returns: Value of the attribute or fillvalue, if attribute was not found.
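The lookup order can be sketched as a simple fallback chain. The function and the stand-in objects are hypothetical; the attribute names `datasetInfo` and `globalInfo` follow the wording of the description and may differ from the names used internally.

```python
from types import SimpleNamespace


def fetch_attribute(obj, attr, fillvalue=None):
    """Hypothetical fallback chain: the object, then datasetInfo, then globalInfo."""
    holders = (
        obj,
        getattr(obj, "datasetInfo", None),
        getattr(obj, "globalInfo", None),
    )
    for holder in holders:
        if holder is not None and hasattr(holder, attr):
            return getattr(holder, attr)
    return fillvalue


# Stand-in objects with placeholder values.
txname = SimpleNamespace(
    constraint="placeholder",
    datasetInfo=SimpleNamespace(dataType="upperLimit"),
    globalInfo=SimpleNamespace(dataType="ignored", lumi="20/fb"),
)
print(fetch_attribute(txname, "dataType"), fetch_attribute(txname, "lumi"))
```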
-
getEfficiencyFor
(element)[source]¶ For upper limit results, checks if the input element falls inside the upper limit grid and has a non-zero reweighting factor. If it does, returns efficiency = 1, else returns efficiency = 0. For efficiency map results, returns the signal efficiency including the lifetime reweighting. If a mass array is given as input, no lifetime reweighting will be applied.
Parameters: element – Element object or mass array with units. Returns: efficiency (float)
-
getInfo
(infoLabel)[source]¶ Returns the value of info field.
Parameters: infoLabel – label of the info field (string). It must be an attribute of the TxNameInfo object
-
getMassVectorFromElement
(element)[source]¶ Given an element, extract the mass vector for the server query. The element can be a list of masses or an “Element” object.
Returns: e.g. [[300,100],[300,100]]
-
getULFor
(element, expected=False)[source]¶ Returns the upper limit (or expected) for element (only for upperLimit-type). Includes the lifetime reweighting (ul/reweight). If called for efficiencyMap results raises an error. If a mass array is given as input, no lifetime reweighting will be applied.
Parameters: - element – Element object or mass array (with units)
- expected – look in self.txnameDataExp, not self.txnameData
-
hasElementAs
(element)[source]¶ Verify if the conditions or constraint in Txname contains the element. Check both branch orderings. If both orderings match, returns the one with the highest mass array.
Parameters: element – Element object Returns: A copy of the element on the correct branch ordering appearing in the Txname constraint or condition.
-
-
class
experiment.txnameObj.
TxNameData
(value, dataType, Id, accept_errors_upto=0.05, Leff_inner=None, Leff_outer=None)[source]¶ Bases:
object
Holds the data for the Txname object. It holds Upper limit values or efficiencies.
Parameters: - value – values in string format
- dataType – the dataType (upperLimit or efficiencyMap)
- Id – an identifier, must be unique for each TxNameData!
- accept_errors_upto – If None, do not allow extrapolations outside of convex hull. If float value given, allow that much relative uncertainty on the upper limit / efficiency when extrapolating outside convex hull. This parameter can be used to loosen the equal branches assumption.
- Leff_inner – is the effective inner radius of the detector, given in meters (used for reweighting prompt decays). If None, default values will be used.
- Leff_outer – is the effective outer radius of the detector, given in meters (used for reweighting decays outside the detector). If None, default values will be used.
-
computeV
(values)[source]¶ Compute rotation matrix _V, and triangulation self.tri
Parameters: values – Nested array with the data values without units
-
coordinatesToData
(point, rotMatrix=None, transVector=None)[source]¶ A function that returns the original mass and width array (including the widths as tuples) for a given point in PCA space (inverse of dataToCoordinates).
Parameters: - point – Point in PCA space (1D list with size equal to self.full_dimensionality or self.dimensionality)
- rotMatrix – Rotation matrix for PCA (e.g. self._V). If None, no rotation is performed.
- transVector – Translation vector for PCA (e.g. self.delta_x). If None no translation is performed
Returns: nested mass array including the widths as tuples (e.g. [[(200,1e-10),100],[(200,1e-10),100]])
-
dataToCoordinates
(dataPoint, rotMatrix=None, transVector=None)[source]¶ Format a dataPoint to the format used for interpolation. All the units are removed, the widths are rescaled and the masses and widths are combined in a flat array. The input can be an Element object or a nested mass-and-width array (with tuples to store the relevant widths).
Parameters: - dataPoint – Element object from which the mass and width arrays will be extracted, or a nested mass array from the database, which contains tuples to include the width values
- rotMatrix – Rotation matrix for PCA (e.g. self._V). If None, no rotation is performed.
- transVector – Translation vector for PCA (e.g. self.delta_x). If None no translation is performed
Returns: Point (list of floats)
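The role of the rotation matrix and translation vector, and why coordinatesToData can act as the inverse, can be sketched for already flattened, unit-free points. This is an illustration under assumptions: the order of operations (translate, then rotate) and the function names are hypothetical, the matrix is taken to be orthogonal, and unit removal, width rescaling and nested arrays are left out.

```python
import math


def to_coordinates(point, rot_matrix=None, trans_vector=None):
    """Translate, then rotate: p' = (p - t) . V (each step skipped if None)."""
    if trans_vector is not None:
        point = [p - t for p, t in zip(point, trans_vector)]
    if rot_matrix is not None:
        point = [
            sum(point[i] * rot_matrix[i][j] for i in range(len(point)))
            for j in range(len(rot_matrix[0]))
        ]
    return point


def to_data(point, rot_matrix=None, trans_vector=None):
    """Inverse of to_coordinates for an orthogonal V: p = V . p' + t."""
    if rot_matrix is not None:
        point = [
            sum(rot_matrix[i][j] * point[j] for j in range(len(point)))
            for i in range(len(rot_matrix))
        ]
    if trans_vector is not None:
        point = [p + t for p, t in zip(point, trans_vector)]
    return point


# Placeholder 2D rotation and shift, for illustration only.
angle = 0.3
V = [[math.cos(angle), -math.sin(angle)], [math.sin(angle), math.cos(angle)]]
shift = [250.0, 80.0]
original = [300.0, 100.0]
round_trip = to_data(to_coordinates(original, V, shift), V, shift)
```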
-
getDataShape
(value)[source]¶ Determines the data format (mass shape) and stores it for future use. If there are inclusive objects (mass or branch = None), store their positions.
Parameters: value – list of data points
-
getUnits
(value)[source]¶ Get standard units for the input object. Uses the units defined in physicsUnits.standardUnits. (e.g. [[100*GeV,100.*GeV],3.*pb] -> returns [[GeV,GeV],fb] [[100*GeV,3.],[200.*GeV,2.*pb]] -> returns [[GeV,1.],[GeV,fb]] )
Parameters: value – Object containing units (e.g. [[100*GeV,100.*GeV],3.*pb]) Returns: Object with same structure containing the standard units used to normalize the data.
-
getValueFor
(element)[source]¶ Interpolates the value and returns the UL or efficiency for the respective element rescaled according to the reweighting function self.reweightF. For UL-type data the default rescaling is ul -> ul/(fraction of prompt decays) and for EM-type data it is eff -> eff*(fraction of prompt decays). If a mass array is given as input, no lifetime reweighting will be applied.
Parameters: element – Element object or mass array (with units)
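The default rescaling rules stated above can be written out as a short sketch. The function `reweight` is hypothetical, takes the fraction of prompt decays as a given input rather than computing it, and assumes that fraction lies in (0, 1].

```python
def reweight(value, data_type, prompt_fraction):
    """Apply the default rescaling described for UL-type and EM-type data."""
    if data_type == "upperLimit":
        # Fewer prompt decays weaken (raise) the upper limit.
        return value / prompt_fraction
    if data_type == "efficiencyMap":
        # Fewer prompt decays lower the effective efficiency.
        return value * prompt_fraction
    raise ValueError(f"unknown dataType: {data_type}")


# Illustrative numbers only.
print(reweight(10.0, "upperLimit", 0.5))
print(reweight(0.2, "efficiencyMap", 0.5))
```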
-
getValueForPoint
(point)[source]¶ Returns the UL or efficiency for the point (in coordinates) using interpolation
Parameters: point – Point in coordinate space (length = self.full_dimensionality) Returns: Value of UL or efficiency (float) without units
-
getWidthPosition
(value)[source]¶ Gets the positions of the widths to be used for interpolation.
Parameters: value – data point Returns: A list with the position of the widths. A position is a tuple of the form (branch-index,vertex-index).
-
interpolate
(point, fill_value=nan)[source]¶ Returns the interpolated value for the point (in coordinates)
Parameters: point – Point in coordinate space (length = self.dimensionality) Returns: Value for point without units