API¶
Welcome to the API. The main packages and modules are listed under ‘Quick access’ below.
All functions are described in detail further down the page.
Quick access¶
dicom_converter¶
add_metadata¶
Created on Tue Sep 7 10:06:08 2021
@author: eloyen
- dicom_converter.add_metadata.add_label_in_dcm(dcmFile, label, tag)¶
Add the label metadata in the DICOM file
- Parameters
- dcmFilestring
/…/dicominfo.json
- labelshort string
ex: ‘0’, ‘1’
- tagtuple of two elements
DICOM tag, must be in hexadecimal format. ex: (0x10,0x20)
- Returns
- dcmFilestring
dcmFile with the new added metadata
- dicom_converter.add_metadata.go_through_folder(folderPath, label, tag)¶
Go through the folder to add the label tag metadata
- Parameters
- folderPathstring
/…/dicoms/
- labelshort string
ex: ‘0’, ‘1’
- tagtuple of two elements
DICOM tag, must be in hexadecimal format. ex: (0x10,0x20)
- Returns
- None.
classify_data¶
Created on Tue Sep 7 13:58:39 2021
@author: eloyen
- dicom_converter.classify_data.classify_in_labelled_folders(inputFolder, labelTag, outputDir)¶
Classify the IMAGES vs METADATA in folders according to the label tag
- Parameters
- inputFolderstring
/…/dicoms/
- labelTagtuple of two elements
DICOM tag, must be in hexadecimal format. ex: (0x10,0x20)
- outputDirstring
/…/outputs
- Returns
- None.
- dicom_converter.classify_data.get_tag_from_json(json_path, tag)¶
Get tag value from .json file containing DICOM metadata
- Parameters
- json_pathstring
/…/dicominfo.json
- tagtuple of two elements
DICOM tag, must be in hexadecimal format. ex: (0x10,0x20)
- Returns
- valueValue stored in tag
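The lookup described above might look like the following minimal sketch. It assumes the .json file follows the DICOM JSON Model (the layout produced, for instance, by pydicom's `Dataset.to_json()`), where each key is the eight-digit hexadecimal tag and the values sit under `"Value"`. The file layout actually written by this package may differ, and the helper name `lookup_tag` and the sample metadata are hypothetical.

```python
import json


def lookup_tag(metadata: dict, tag: tuple):
    """Return the first value stored under `tag`, or None if it is absent."""
    group, element = tag
    key = f"{group:04X}{element:04X}"  # (0x10, 0x20) -> "00100020"
    values = metadata.get(key, {}).get("Value", [])
    return values[0] if values else None


# Hypothetical metadata, inlined here instead of being read from a file.
metadata = json.loads('{"00100020": {"vr": "LO", "Value": ["PAT-001"]}}')
patient_id = lookup_tag(metadata, (0x10, 0x20))
missing = lookup_tag(metadata, (0x08, 0x50))
```

Because Python evaluates `(0x10, 0x20)` as `(16, 32)`, the sketch formats both halves back to four hexadecimal digits before building the key.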
utils¶
cat_to_dataset¶
- dicom_converter.utils.cat_to_dataset.cat_to_dataset(datapath='', trainset_percentage=0.7, validset_percentage=0.2, testset_percentage=0.1, seed=3)¶
Divides a directory that contains only one folder per category to classify into ‘train’, ‘valid’ and ‘test’ folders, each containing every category, for ML with custom splitting percentages. trainset_percentage + validset_percentage + testset_percentage should equal 1.
- Parameters
- datapathstring
Datapath directory should only contain folders with each category to classify. Each category folder must only contain image files of that category. The default is “”.
- trainset_percentagefloat
Training set percentage (trainset_percentage + validset_percentage + testset_percentage should be = 1). The default is 0.7.
- validset_percentagefloat
Validation set percentage (trainset_percentage + validset_percentage + testset_percentage should be = 1). The default is 0.2.
- testset_percentagefloat
Test set percentage (trainset_percentage + validset_percentage + testset_percentage should be = 1). The default is 0.1.
- seedinteger
Random seed. The default is 3.
- Returns
- None.
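To illustrate the splitting logic for a single category folder, here is a minimal sketch using only the standard library. The function name `split_files`, the rounding rule and the choice to give the remainder to the test set are assumptions made for this example; the package may shuffle and round differently, and it also moves files on disk, which the sketch does not do.

```python
import math
import random


def split_files(files, train=0.7, valid=0.2, test=0.1, seed=3):
    """Shuffle `files` reproducibly and cut them into three subsets."""
    if not math.isclose(train + valid + test, 1.0):
        raise ValueError("percentages must sum to 1")
    shuffled = sorted(files)  # fixed starting order, so the seed is decisive
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],  # remainder goes to the test set
    }


files = [f"img_{i:02d}.png" for i in range(10)]
subsets = split_files(files)
sizes = {name: len(items) for name, items in subsets.items()}
```

With ten files and the default percentages, `sizes` holds 7, 2 and 1 entries for ‘train’, ‘valid’ and ‘test’, and calling the function again with the same seed returns the same subsets.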
- dicom_converter.utils.cat_to_dataset.dataset_to_cat(datapath='')¶
Groups a directory containing ‘train’, ‘test’ and ‘valid’ folders (the code is case sensitive) for ML, each with category subfolders, into category folders.
- Parameters
- datapathstring
Datapath directory should only contain ‘train’, ‘test’ and ‘valid’ folders (the code is case sensitive). Each folder should only contain folders with each category to classify. Each category folder must only contain image files of that category. The default is “”.
- Returns
- None.
dicom_to_img¶
Created on Wed Sep 1 15:44:11 2021
Module with functions used to convert DICOM files (.dcm) to .png/.bmp and .json files and from .png/.bmp to DICOM.
The function ‘compress_to_png’ calls executables from OpenJPEG (https://www.openjpeg.org/) available at ‘https://github.com/uclouvain/openjpeg/releases/tag/v2.4.0’.
- dicom_converter.utils.dicom_to_img.compress_to_png(file_path, software_root, compress_ratio=1)¶
Compresses an image to a .png with a specified ‘compress_ratio’
- Parameters
- file_pathstring
/…/filename.png
- software_rootstring
Path to the folder containing the openjpeg .exe programs
- compress_ratioint, optional
Best if multiple of 8. The default is 1.
- Returns
- None.
- dicom_converter.utils.dicom_to_img.decompose_dicom(file_path, output_path, img_format='bmp', removeImgInJson=False)¶
Divides dicom file into a .json file with the dicom metadata and a .’img_format’ file containing the image.
- Parameters
- file_pathstring
/…/filename.dcm
- output_pathstring
/…/foldername/
- img_formatstring, optional
Image file format : bmp, png, … The default is ‘bmp’.
- removeImgInJsonTrue/False, optional
Removes PixelData from dicom metadata. The default is False.
- Returns
- None.
- dicom_converter.utils.dicom_to_img.dicom_from_img_or_json(file_path, output_folder, metadata_path=None, randomizeName=False, verbose=False)¶
Creates a dicom from a .json file or a .png/.bmp file
- Parameters
- file_pathstring
/…/filename.png|.json
- output_folderstring
/…/foldername/
- metadata_pathstring, optional
path to reference dicom file. The default is dcm_file_path.
- randomizeNameTrue/False, optional
Creates a random patientID. The default is False.
- verboseTrue/False, optional
Raises warnings. The default is False.
- Returns
- None.
- dicom_converter.utils.dicom_to_img.get_tag_from_json(json_path, tag)¶
Get tag value from .json file containing DICOM metadata
- Parameters
- json_pathstring
/…/dicominfo.json
- tagtuple of two elements
DICOM tag, must be in hexadecimal format. ex: (0x10,0x20)
- Returns
- valueValue stored in tag
- dicom_converter.utils.dicom_to_img.img_from_dicom(ds)¶
Extract the image array from the dicom dataset ‘ds’ with [0,256] pixel intensities.
- Parameters
- dsDataset object of pydicom.dataset module
- Returns
- darray
Image array of the dicom dataset
hospital_split¶
Splits a large dataset (grouped by category) into small datasets with custom individual sizes
- dicom_converter.utils.hospital_split.hospital_split(number_of_datasplits=3, split_percentages=array([0.5, 0.3, 0.2]), seed=3, datapath='')¶
- Parameters
- number_of_datasplitsinteger
Number of data splits (hospitals needed). The default is 3.
- split_percentagesarray
Split percentage for each dataset (hospital). The number of components must be equal to ‘number_of_datasplits’, and their sum must be exactly 1. The default is np.array([0.5,0.3,0.2]).
- seedinteger
Random seed used. The default is 3.
- datapathstring
Datapath directory should only contain folders with each category to classify. Each category folder must only contain image files of that category. The default is “”.
- Returns
- None.
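The two constraints on ‘split_percentages’ (one entry per split, entries summing to 1) can be checked before any file is touched. The sketch below is a hedged illustration, not the package's code: `split_for_hospitals` is a hypothetical name, a plain tuple replaces the NumPy array, and cutting at cumulative boundaries is an assumption chosen so that rounding errors do not accumulate.

```python
import math
import random


def split_for_hospitals(files, split_percentages=(0.5, 0.3, 0.2),
                        number_of_datasplits=3, seed=3):
    """Partition `files` into `number_of_datasplits` disjoint subsets."""
    if len(split_percentages) != number_of_datasplits:
        raise ValueError("one percentage is needed per data split")
    if not math.isclose(sum(split_percentages), 1.0):
        raise ValueError("split percentages must sum to 1")
    shuffled = sorted(files)
    random.Random(seed).shuffle(shuffled)
    splits, start, cumulative = [], 0, 0.0
    for pct in split_percentages:
        cumulative += pct
        end = round(cumulative * len(shuffled))  # cut at the cumulative boundary
        splits.append(shuffled[start:end])
        start = end
    return splits


all_files = [f"img_{i:02d}.png" for i in range(20)]
hospitals = split_for_hospitals(all_files)
hospital_sizes = [len(h) for h in hospitals]
```

For twenty files and the default percentages, the three hypothetical hospitals receive 10, 6 and 4 files, and every file lands in exactly one subset.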
dicom_pseudonymizer¶
anonymizer¶
- dicom_pseudonymizer.anonymizer.anonymize(input_path: str, output_path: str, lookup_path: str, anonymization_actions: dict, delete_private_tags: bool, rename_files: bool) None ¶
Read data from input path (folder or file) and launch the anonymization.
- Parameters
- input_pathstr
Path to a folder or to a file. If set to a folder, all files within it are traversed and anonymized.
- output_pathstr
Path to a folder or to a file.
- lookup_pathstr
Path to the lookup table csv file.
- anonymization_actionsdict
List of actions that will be applied on tags.
- delete_private_tagsbool
Whether to delete private tags.
- rename_filesbool
Whether to rename output files with the pseudonym.
- Returns
- None.
- dicom_pseudonymizer.anonymizer.generate_actions_dictionary(map_action_tag, defined_action_map={}) dict ¶
Generate a new dictionary which maps action functions to tags
- Parameters
- map_action_tagdict
Link actions to tags
- defined_action_mapdict
Link action name to action function
- Returns
- generated_mapdict.
The map of actions.
- dicom_pseudonymizer.anonymizer.main(defined_action_map={})¶
utils¶
dicom_fields¶
Tags anonymized in the DICOM standard. Documentation for the meaning of the groups can be found in the default associated actions: http://dicom.nema.org/dicom/2013/output/chtml/part15/chapter_E.html#table_E.1-1
This code was taken and adapted from https://github.com/KitwareMedical/dicom-anonymizer
format_tag¶
Utility for printing the tags in the original hex format.
This code was taken and adapted from https://github.com/KitwareMedical/dicom-anonymizer
- dicom_pseudonymizer.utils.format_tag.hex_to_string(x: hex)¶
Convert a tag number to its original hex string. E.g. if a tag has the hex number 0x0008, Python evaluates it as 8, and we then convert it back to 0x0008 (as a string).
- Parameters
- xhex
The hex number to be converted.
- Returns
- sstr
The hex tag converted to hex number string.
- dicom_pseudonymizer.utils.format_tag.tag_to_hex_strings(tag: tuple)¶
Convert a tag tuple to a tuple of full hex number strings.
E.g. (0x0008, 0x0010) is evaluated as (8, 16) by python. So we convert it back to a string ‘(0x0008, 0x0010)’ for pretty printing.
- Parameters
- tagtuple
The tuple to be converted from hex numbers to hex number strings.
- Returns
- stuple
The hex tag converted to hex number string.
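The round trip described for these two helpers can be sketched in a few lines. The functions below are stand-alone re-implementations written for this page, not the package source; they assume that four zero-padded lowercase hexadecimal digits are the intended format, as in the ‘0x0008’ example.

```python
def hex_to_string(x: int) -> str:
    """Format an integer as a zero-padded, four-digit hex string."""
    return f"0x{x:04x}"


def tag_to_hex_strings(tag: tuple) -> tuple:
    """Apply hex_to_string to both halves of a (group, element) tag."""
    return tuple(hex_to_string(part) for part in tag)


group_as_int = 0x0008  # Python stores this as the plain integer 8
pretty_tag = tag_to_hex_strings((0x0008, 0x0010))
print(pretty_tag)  # ('0x0008', '0x0010')
```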
simple_dicomanonymizer¶
- dicom_pseudonymizer.utils.simple_dicomanonymizer.anonymize_dataset(dataset: pydicom.dataset.Dataset, extra_anonymization_rules: Optional[dict] = None, delete_private_tags: bool = True) None ¶
Anonymize a pydicom Dataset by using anonymization rules that link an action to a tag
- Parameters
- datasetFileDataset object of pydicom.dataset module
Dataset to be anonymized
- extra_anonymization_rulesdict
Rules to be applied on the dataset
- delete_private_tagsbool
Define if private tags should be deleted or not
- Returns
- None.
- dicom_pseudonymizer.utils.simple_dicomanonymizer.anonymize_dicom_file(in_file: str, out_file: str, lookup_file: Optional[str] = None, extra_anonymization_rules: Optional[dict] = None, delete_private_tags: bool = True, rename_files: bool = False) None ¶
Anonymize a DICOM file by modifying personal tags
Conforms to DICOM standard except for customer specificities.
- Parameters
- in_filestr
File path or file-like object to read from
- out_filestr
File path or file-like object to write to
- lookup_filestr
File path to the lookup table.
- extra_anonymization_rulesdict
Additional tag actions
- delete_private_tagsbool
Define if private tags should be deleted or not
- Returns
- None.
- dicom_pseudonymizer.utils.simple_dicomanonymizer.clean(dataset, tag)¶
C - clean, that is replace with values of similar meaning known not to contain identifying information and consistent with the VR
- dicom_pseudonymizer.utils.simple_dicomanonymizer.delete(dataset, tag)¶
X - remove
- dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_element(dataset, element)¶
Delete the element from the dataset. If VR’s element is a date, then it will be replaced by 00010101
- dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_empty(dataset, tag)¶
X/Z - X unless Z is required to maintain IOD conformance (Type 3 versus Type 2)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_empty_or_replace(dataset, tag)¶
X/Z/D - X unless Z or D is required to maintain IOD conformance (Type 3 versus Type 2 versus Type 1)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_empty_or_replace_UID(dataset, tag)¶
X/Z/U* - X unless Z or replacement of contained instance UIDs (U) is required to maintain IOD conformance (Type 3 versus Type 2 versus Type 1 sequences containing UID references)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_replace(dataset, tag)¶
X/D - X unless D is required to maintain IOD conformance (Type 3 versus Type 1)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.empty(dataset, tag)¶
Z - replace with a zero length value, or a non-zero length value that may be a dummy value and consistent with the VR
- dicom_pseudonymizer.utils.simple_dicomanonymizer.empty_element(element)¶
Clean the element according to its VR:
- SH, PN, UI, LO, CS: value will be set to ‘’
- DA: value will be replaced by ‘00010101’
- TM: value will be replaced by ‘000000.00’
- UL: value will be replaced by 0
- SQ: “empty_element” will be called on all subelements
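The VR rules listed for empty_element amount to a small lookup table. The sketch below encodes only the non-sequence rules; the names `EMPTY_VALUE_BY_VR` and `empty_value` are hypothetical, the SQ recursion is left out, and raising an error for an unlisted VR is an assumption, since the source does not say how such elements are handled.

```python
# Placeholder values per VR, taken from the rules listed for empty_element.
EMPTY_VALUE_BY_VR = {
    **dict.fromkeys(("SH", "PN", "UI", "LO", "CS"), ""),
    "DA": "00010101",
    "TM": "000000.00",
    "UL": 0,
}


def empty_value(vr: str):
    """Return the placeholder for `vr`; unlisted VRs are rejected (assumption)."""
    if vr not in EMPTY_VALUE_BY_VR:
        raise NotImplementedError(f"no empty value defined for VR {vr}")
    return EMPTY_VALUE_BY_VR[vr]


date_placeholder = empty_value("DA")
name_placeholder = empty_value("PN")
```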
- dicom_pseudonymizer.utils.simple_dicomanonymizer.empty_or_replace(dataset, tag)¶
Z/D - Z unless D is required to maintain IOD conformance (Type 2 versus Type 1)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.generate_actions(tag_list: list, action, options: Optional[dict] = None) dict ¶
Generate a dictionary using the list values as tags and assigning the same action to each of them
- Parameters
- tag_listlist
List of tags which will have the same associated actions
- actionfunction
Define the action that will be used. It can be a callable custom function or a name of a pre-defined action from simpledicomanonymizer
- optionsdict
Define options that will be passed to the action (like regexp)
- Returns
- d: dict
The generated dictionary with the action to be applied.
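The core idea, one action shared by a list of tags, can be shown with a dictionary comprehension. This is a simplified sketch under stated assumptions: `map_tags_to_action` and the `blank` action are hypothetical, the ‘options’ argument and action names given as strings are not handled, and a plain dict stands in for a pydicom Dataset.

```python
def map_tags_to_action(tag_list, action):
    """Associate every tag in `tag_list` with the same `action`."""
    return {tag: action for tag in tag_list}


def blank(dataset, tag):
    """Hypothetical action: overwrite the tag's value with an empty string."""
    dataset[tag] = ""


actions = map_tags_to_action([(0x0010, 0x0010), (0x0010, 0x0030)], blank)

# A plain dict stands in for a pydicom Dataset in this sketch.
record = {(0x0010, 0x0010): "Doe^John", (0x0010, 0x0030): "19700101"}
for tag, action in actions.items():
    action(record, tag)
```

After the loop, both entries of `record` hold an empty string, because each tag was dispatched to the same callable.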
- dicom_pseudonymizer.utils.simple_dicomanonymizer.get_private_tag(dataset, tag)¶
Get the creator and element from tag
- Parameters
- datasetFileDataset object of pydicom.dataset module
Dicom dataset
- tagtuple
Tag from which we want to extract private information
- Returns
- ddict
Dictionary with creator of the tag and tag element (which contains element + offset)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.get_private_tags(anonymization_actions: dict, dataset: pydicom.dataset.Dataset) List[dict] ¶
Extract private tags as a list of objects with creator and element
- Parameters
- anonymization_actionsdict
List of tags associated to an action.
- datasetFileDataset object of pydicom.dataset module
Dicom dataset which will be anonymized and contains all private tags
- Returns
- dArray of object
Array with private tags
- dicom_pseudonymizer.utils.simple_dicomanonymizer.initialize_actions() dict ¶
Initialize anonymization actions with DICOM standard values
- Parameters
- None.
- Returns
- d: dict
Dict object which maps actions to tags
- dicom_pseudonymizer.utils.simple_dicomanonymizer.keep(dataset, tag)¶
K - keep (unchanged for non-sequence attributes, cleaned for sequences)
- dicom_pseudonymizer.utils.simple_dicomanonymizer.regexp(options: dict)¶
Apply a regexp method to the dataset
- Parameters
- optionsdict
- Contains two values:
find: which string should be found
replace: string that will replace the found string
- Returns
- sstring
The string with the regexp applied.
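The ‘find’ and ‘replace’ options map naturally onto `re.sub`. The helper below is a hypothetical stand-in that works on a plain string; how the package wires the options into a dataset action is not shown in this page, so that part is deliberately left out.

```python
import re


def apply_regexp(value: str, options: dict) -> str:
    """Replace every match of options['find'] in `value` with options['replace']."""
    return re.sub(options["find"], options["replace"], value)


# Hypothetical options that mask an ISO-style date inside a description.
options = {"find": r"\d{4}-\d{2}-\d{2}", "replace": "DATE"}
masked = apply_regexp("Study 2021-09-07", options)
```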
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace(dataset, tag)¶
D - replace with a non-zero length value that may be a dummy value and consistent with the VR
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_UID(dataset, tag)¶
U - replace with a non-zero length UID that is internally consistent within a set of Instances. Lazy solution: replace with an empty string.
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_and_keep_correspondence(dataset, tag)¶
P - addition to pseudonymize the code and keep a lookup table. If used, it should be called when tag (0x0010, 0x0020) (PatientID) is encountered. It also implicitly replaces the tag (0x0008, 0x0050) (AccessionNumber). A lookup table (csv file) is created with columns: ‘old_patient_id’, ‘new_patient_id’, ‘old_accession_number’, ‘new_accession_number’
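A lookup table with the four columns named above can be written with the standard `csv` module. In this sketch the function name `write_lookup_table` and all identifier values are invented for illustration, and an in-memory buffer replaces the csv file; how the package generates the new identifiers is not covered.

```python
import csv
import io

COLUMNS = ["old_patient_id", "new_patient_id",
           "old_accession_number", "new_accession_number"]


def write_lookup_table(rows, stream):
    """Write the header and one line per pseudonymized patient to `stream`."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()  # stands in for the csv file on disk
write_lookup_table(
    [{"old_patient_id": "PAT-001", "new_patient_id": "ANON-0001",
      "old_accession_number": "ACC-77", "new_accession_number": "ANON-ACC-0001"}],
    buffer,
)
table_lines = buffer.getvalue().splitlines()
```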
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element(element)¶
Replace the element’s value according to its VR:
- DA: cf replace_element_date
- TM: replace with ‘000000.00’
- LO, SH, PN, CS: replace with ‘Anonymized’
- UI: cf replace_element_UID
- IS: replace with ‘0’
- FD, FL, SS, US: replace with 0
- ST: replace with ‘’
- SQ: call replace_element for all sub elements
- DT: cf replace_element_date_time
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element_UID(element)¶
Keep the non-digit characters but replace each digit with a random digit. The replaced value is kept in a dictionary linked to the initial element.value, so that the same replacement is applied automatically if another UID has the same value.
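One reading of this description is: swap every digit for a random digit, leave the dots in place, and cache the result so that equal UIDs always map to the same replacement. The sketch below follows that reading; `pseudonymize_uid` and `_uid_cache` are hypothetical names, and the sketch does not check UID validity rules such as the ban on leading zeros in a component.

```python
import random

_uid_cache = {}  # original UID -> replacement, shared across calls


def pseudonymize_uid(uid: str, rng=random) -> str:
    """Swap each digit for a random one, keep other characters, reuse past results."""
    if uid not in _uid_cache:
        _uid_cache[uid] = "".join(
            str(rng.randint(0, 9)) if ch.isdigit() else ch for ch in uid
        )
    return _uid_cache[uid]


original = "1.2.840.10008.1.2"
first = pseudonymize_uid(original)
second = pseudonymize_uid(original)  # served from the cache, identical to `first`
```

The cache is what keeps references between files consistent: two datasets that pointed at the same UID before anonymization still point at the same value afterwards.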
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element_date(element)¶
Replace date element’s value with ‘00010101’
- dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element_date_time(element)¶
Replace date time element’s value with ‘00010101010101.000000+0000’
federated_learning¶
client¶
- federated_learning.client.client.as_tensor(data, dtype=None, device=None) Tensor ¶
Convert the data into a torch.Tensor. If the data is already a Tensor with the same dtype and device, no copy will be performed, otherwise a new Tensor will be returned with computational graph retained if data Tensor has requires_grad=True. Similarly, if the data is an ndarray of the corresponding dtype and the device is the cpu, no copy will be performed.
- Args:
- data (array_like): Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
- dtype (torch.dtype, optional): the desired data type of returned tensor. Default: if None, infers data type from data.
- device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
Example:
>>> a = numpy.array([1, 2, 3])
>>> t = torch.as_tensor(a)
>>> t
tensor([ 1, 2, 3])
>>> t[0] = -1
>>> a
array([-1, 2, 3])
>>> a = numpy.array([1, 2, 3])
>>> t = torch.as_tensor(a, device=torch.device('cuda'))
>>> t
tensor([ 1, 2, 3])
>>> t[0] = -1
>>> a
array([1, 2, 3])
- federated_learning.client.client.main(arch: str <Architecture to use> = 'resnet18', lr: int <Learning Rate> = 0.0006000000000000001, epochs: int <number of epochs for training> = 5, bs: int <Batch size to use> = 64, device: str <Which device to use> = 'cuda:0', port: int <The port used for federated learning> = 8080, apply_dp: <Learning rate> = True, alphas: range <Alphas> = range(2, 32), noise_multiplier: int <Noise injected in DP> = 0.5, max_grad_norm: int <Maximum Gradient Norm when clipping> = 1.0, delta: int <Delta> = 1e-05, matrix_path: str <Pass a value to save the confusion matrix> = None, csv_path: str <Pass a value to store the logs in csv> = None, roc_path: str <Pass a value to store the ROC-AUC curve> = None, data_path: str <datapath to use> = '../../../../Hospitals/H0', seed: int <Pass a value to set seed> = 42)¶
dsail.differential_privacy¶
dsail.federated_learning¶
- class federated_learning.client.dsail.federated_learning.FLClient(learn, lr, ep, apply_dp, alphas, noise_multiplier, max_grad_norm, delta, device, csv_path, data_path, matrix_path, roc_path)¶
Bases: object
- evaluate(parameters, config)¶
- fit(parameters, config)¶
- get_parameters()¶
- set_parameters(parameters)¶
dsail.utils¶
- class federated_learning.client.dsail.utils.ImbalancedDatasetSampler(dataset, indices: Optional[list] = None, num_samples: Optional[int] = None, callback_get_label: Optional[Callable] = None)¶
Bases: Generic[torch.utils.data.sampler.T_co]
Samples elements randomly from a given list of indices for an imbalanced dataset
- Parameters
- indices: list
a list of indices
- num_samples: int
number of samples to draw
- callback_get_label: Callable
a callback-like function which takes two arguments - dataset and index
- federated_learning.client.dsail.utils.get_imbalance_weights(ds)¶
- federated_learning.client.dsail.utils.save_matrix(learn, path)¶
- federated_learning.client.dsail.utils.save_roc(learn, path)¶
- federated_learning.client.dsail.utils.set_seed(dls, seed)¶
server¶
- class federated_learning.server.server.SaveModelStrategy(*args, **kwargs)¶
Bases: flwr.server.strategy.fedavg.FedAvg
- federated_learning.server.server.main(fraction_fit: float <The fraction of available client used for training> = 1.0, min_fit_clients: int <The minimum number of clients used to start training> = 3, min_available_clients: int <The minimum number of clients used to start server> = 3, min_eval_clients: int <The minimum number of clients used for evaluation> = 3, num_rounds: int <The number of rounds of training> = 3, save_path: str <Set a path to save weights> = None)¶