API

Welcome to the API. The main packages and modules are available in the ‘Quick access’ below.

All functions are defined in details further down the page.

dicom_converter

add_metadata

Created on Tue Sep 7 10:06:08 2021

@author: eloyen

dicom_converter.add_metadata.add_label_in_dcm(dcmFile, label, tag)

Add the label metadata in the DICOM file

Parameters
dcmFilestring

/…/dicominfo.json

labelshort string

ex: ‘0’, ‘1’

tagtuple of two elements

DICOM tag, must be in hexagonal format. ex: (0x10,0x20)

Returns
dcmFilestring

dcmFile with the new added metadata

dicom_converter.add_metadata.go_through_folder(folderPath, label, tag)

Go trough the folder to add the label tag metadata

Parameters
folderPathstring

/…/dicoms/

labelshort string

ex: ‘0’, ‘1’

tagtuple of two elements

DICOM tag, must be in hexagonal format. ex: (0x10,0x20)

Returns
None.

classify_data

Created on Tue Sep 7 13:58:39 2021

@author: eloyen

dicom_converter.classify_data.classify_in_labelled_folders(inputFolder, labelTag, outputDir)

Classify the IMAGES vs METADATA in folders according to the label tag

Parameters
inputFolderstring

/…/dicoms/

labelTagtuple of two elements

DICOM tag, must be in hexagonal format. ex: (0x10,0x20)

outputDirstring

/…/outputs

Returns
——-
None.
dicom_converter.classify_data.get_tag_from_json(json_path, tag)

Get tag value from .json fiel containing DICOM metadata

Parameters
json_pathstring

/…/dicominfo.json

tagtuple of two elements

DICOM tag, must be in hexagonal format. ex: (0x10,0x20)

Returns
valueValue stored in tag

utils

cat_to_dataset

dicom_converter.utils.cat_to_dataset.cat_to_dataset(datapath='', trainset_percentage=0.7, validset_percentage=0.2, testset_percentage=0.1, seed=3)

Divides a directory containing only contain folders with each category to classify into ‘train’, ‘valid’ and ‘test’ folders contain each category for ML with custom splitting percentages trainset_percentage + validset_percentage + testset_percentage should be = 1 !

Parameters
datapathstring

Datapath directory should only contain folders with each category to classify. Each category fodler must only contain image files of that category.. The default is “”.

trainset_percentagefloat

Training set percentage (trainset_percentage + validset_percentage + testset_percentage should be = 1). The default is 0.7.

validset_percentagefloat

Validation set percentage (trainset_percentage + validset_percentage + testset_percentage should be = 1). The default is 0.2.

testset_percentagefloat

Test set percentage (trainset_percentage + validset_percentage + testset_percentage should be = 1). The default is 0.1.

seedinteger

Random seed. The default is 3.

Returns
None.
dicom_converter.utils.cat_to_dataset.dataset_to_cat(datapath='')

Groups a directory containing a ‘train’, ‘test’ and ‘valid’ (code is case sensitive) folders for ML with category sub folders into category folders.

Parameters
datapathstring

Datapath directory should only contain a ‘train’, ‘test’ and ‘valid’ (code is case sensitive) folders. Each fodler should only contain folders with each category to classify. Each category fodler must only contain image files of that category. The default is “”.

Returns
None.

dicom_to_img

Created on Wed Sep 1 15:44:11 2021

Module with functions used to convert DICOM files (.dcm) to .png/.bmp and .json files and from .png/.bmp to DICOM.

The function ‘compress_to_png’ calls executables from OpenJPEG (https://www.openjpeg.org/) available at ‘https://github.com/uclouvain/openjpeg/releases/tag/v2.4.0’.

dicom_converter.utils.dicom_to_img.compress_to_png(file_path, software_root, compress_ratio=1)

Compresses an image to a .png with a specified ‘compress_ratio’

Parameters
file_pathstring

/…/filename.png

software_rootstring

Path to the folder containing the openjpeg .exe programs

compress_ratioint, optional

Best if multiple of 8. The default is 1.

Returns
None.
dicom_converter.utils.dicom_to_img.decompose_dicom(file_path, output_path, img_format='bmp', removeImgInJson=False)

Divides dicom file into a .json file with the dicom metadata and a .’img_format’ file containing the image.

Parameters
file_pathstring

/…/filename.dcm

output_pathstring

/…/foldername/

img_formatstring, optional

Image file format : bmp, png, … The default is ‘bmp’.

removeImgInJsonTrue/False, optional

Removes PixelData from dicom metadata. The default is False.

Returns
None.
dicom_converter.utils.dicom_to_img.dicom_from_img_or_json(file_path, output_folder, metadata_path=None, randomizeName=False, verbose=False)

Creates a dicom from a .json file or a .png/.bmp file

Parameters
file_pathstring

/…/filename.png|.json

output_pathstring

/…/foldername/

metadata_pathstring, optional

path to reference dicom file. The default is dcm_file_path.

randomizeNameTrue/False

Creates a random patientID, optional

verboseTrue/False, optional

Raises warnings. The default is False.

Returns
None.
dicom_converter.utils.dicom_to_img.get_tag_from_json(json_path, tag)

Get tag value from .json fiel containing DICOM metadata

Parameters
json_pathstring

/…/dicominfo.json

tagtuple of two elements

DICOM tag, must be in hexagonal format. ex: (0x10,0x20)

Returns
valueValue stored in tag
dicom_converter.utils.dicom_to_img.img_from_dicom(ds)

Extract array from dicom dataset ‘dcm’ with [0,256] pixel intensities.

Parameters
dcmFileDataset object of pydicom.dataset module
Returns
darray

Image array of the dicom dataset

hospital_split

Splits a large dataset (grouped by category) into small datasets with custom individual sizes

dicom_converter.utils.hospital_split.hospital_split(number_of_datasplits=3, split_percentages=array([0.5, 0.3, 0.2]), seed=3, datapath='')
Parameters
number_of_datasplitsinteger

Number of data splits (hospitals needed). The default is 3.

split_percentagesarray

split percentage by dataset (hospital) the number of components must be equal to ‘number_of_datasplits’. Be careful that the sum of them is exactly = 1. The default is np.array([0.5,0.3,0.2]).

seedinteger

Random seed used. The default is 3.

datapathstring

Datapath directory should only contain folders with each category to classify. Each category fodler must only contain image files of that category. The default is “”.

Returns
None.

dicom_pseudonymizer

anonymizer

dicom_pseudonymizer.anonymizer.anonymize(input_path: str, output_path: str, lookup_path: str, anonymization_actions: dict, delete_private_tags: bool, rename_files: bool) None

Read data from input path (folder or file) and launch the anonymization.

Parameters
input_pathstr

Path to a folder or to a file. If set to a folder, then cross all over subfiles and apply anonymization.

output_pathstr

Path to a folder or to a file.

csv_pathstr

Path to lookup table csv path.

anonymization_actionsdict

List of actions that will be applied on tags.

delete_private_tagsbool

Whether to delete private tags.

rename_filesbool

Whether to remane output files with pseudo.

Returns
None.
dicom_pseudonymizer.anonymizer.generate_actions_dictionary(map_action_tag, defined_action_map={}) dict

Generate a new dictionary which maps actions function to tags

Parameters
map_action_tagdict

Link actions to tags

defined_action_mapdict

Link action name to action function

Returns
generated_mapdict.

The map of actions.

dicom_pseudonymizer.anonymizer.main(defined_action_map={})

utils

dicom_fields

Tags anonymized in DICOM standard Documentation for groups meaning can be found in default associated actions. http://dicom.nema.org/dicom/2013/output/chtml/part15/chapter_E.html#table_E.1-1

This code was taken and adapted from https://github.com/KitwareMedical/dicom-anonymizer

format_tag

Utility for printing the tags in the original hex format.

This code was taken and adapted from https://github.com/KitwareMedical/dicom-anonymizer

dicom_pseudonymizer.utils.format_tag.hex_to_string(x: hex)

Convert a tag number to it’s original hex string. E.g. if a tag has the hex number 0x0008, it becomes 8, and we then convert it back to 0x0008 (as a string).

Parameters
xhex

The hex number to be converted.

Returns
sstr

The hex tag converted to hex number string.

dicom_pseudonymizer.utils.format_tag.tag_to_hex_strings(tag: tuple)

Convert a tag tuple to a tuple of full hex number strings.

E.g. (0x0008, 0x0010) is evaluated as (8, 16) by python. So we convert it back to a string ‘(0x0008, 0x0010)’ for pretty printing.

Parameters
tagtuple

The tuple to be converted from hex numbers to hex number strings.

Returns
stuple

The hex tag converted to hex number string.

simple_dicomanonymizer

dicom_pseudonymizer.utils.simple_dicomanonymizer.anonymize_dataset(dataset: pydicom.dataset.Dataset, extra_anonymization_rules: Optional[dict] = None, delete_private_tags: bool = True) None

Anonymize a pydicom Dataset by using anonymization rules which links an action to a tag

Parameters
datasetFileDataset object of pydicom.dataset module

Dataset to be anonymized

extra_anonymization_rulesdict

Rules to be applied on the dataset

delete_private_tagsbool

Define if private tags should be delete or not

Returns
None.
dicom_pseudonymizer.utils.simple_dicomanonymizer.anonymize_dicom_file(in_file: str, out_file: str, lookup_file: Optional[str] = None, extra_anonymization_rules: Optional[dict] = None, delete_private_tags: bool = True, rename_files: bool = False) None

Anonymize a DICOM file by modifying personal tags

Conforms to DICOM standard except for customer specificities.

Parameters
in_filestr

File path or file-like object to read from

out_filestr

File path or file-like object to write to

lookup_filestr

File path to the lookup table.

extra_anonymization_rulesstr

Add more tag’s actions

delete_private_tagsbool

Define if private tags should be delete or not

Returns
d: dict

The generated dictionary with the action to be applied.

dicom_pseudonymizer.utils.simple_dicomanonymizer.clean(dataset, tag)

C - clean, that is replace with values of similar meaning known not to contain identifying information and consistent with the VR

dicom_pseudonymizer.utils.simple_dicomanonymizer.delete(dataset, tag)

X - remove

dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_element(dataset, element)

Delete the element from the dataset. If VR’s element is a date, then it will be replaced by 00010101

dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_empty(dataset, tag)

X/Z - X unless Z is required to maintain IOD conformance (Type 3 versus Type 2)

dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_empty_or_replace(dataset, tag)

X/Z/D - X unless Z or D is required to maintain IOD conformance (Type 3 versus Type 2 versus Type 1)

dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_empty_or_replace_UID(dataset, tag)

X/Z/U* - X unless Z or replacement of contained instance UIDs (U) is required to maintain IOD conformance (Type 3 versus Type 2 versus Type 1 sequences containing UID references)

dicom_pseudonymizer.utils.simple_dicomanonymizer.delete_or_replace(dataset, tag)

X/D - X unless D is required to maintain IOD conformance (Type 3 versus Type 1)

dicom_pseudonymizer.utils.simple_dicomanonymizer.empty(dataset, tag)

Z - replace with a zero length value, or a non-zero length value that may be a dummy value and consistent with the VR

dicom_pseudonymizer.utils.simple_dicomanonymizer.empty_element(element)

Clean element according to the element’s VR: - SH, PN, UI, LO, CS: value will be set to ‘’ - DA: value will be replaced by ‘00010101’ - TM: value will be replaced by ‘000000.00’ - UL: value will be replaced by 0 - SQ: all subelement will be called with “empty_element”

dicom_pseudonymizer.utils.simple_dicomanonymizer.empty_or_replace(dataset, tag)

Z/D - Z unless D is required to maintain IOD conformance (Type 2 versus Type 1)

dicom_pseudonymizer.utils.simple_dicomanonymizer.generate_actions(tag_list: list, action, options: Optional[dict] = None) dict

Generate a dictionary using list values as tag and assign the same value to all

Parameters
tag_listlist

List of tags which will have the same associated actions

actionfunction

Define the action that will be use. It can be a callable custom function or a name of a pre-defined action from simpledicomanonymizer

optionsdict

Define options tht will be affected to the action (like regexp)

Returns
d: dict

The generated dictionary with the action to be applied.

dicom_pseudonymizer.utils.simple_dicomanonymizer.get_private_tag(dataset, tag)

Get the creator and element from tag

Parameters
datasetFileDataset object of pydicom.dataset module

Dicom dataset

tagtuple

Tag from which we want to extract private information

Returns
ddict

Dictionary with creator of the tag and tag element (which contains element + offset)

dicom_pseudonymizer.utils.simple_dicomanonymizer.get_private_tags(anonymization_actions: dict, dataset: pydicom.dataset.Dataset) List[dict]

Extract private tag as a list of object with creator and element

Parameters
anonymization_actionsdict

List of tags associated to an action.

datasetFileDataset object of pydicom.dataset module

Dicom dataset which will be anonymize and contains all private tags

Returns
dArray of object

Array with private tags

dicom_pseudonymizer.utils.simple_dicomanonymizer.initialize_actions() dict

Initialize anonymization actions with DICOM standard values

Parameters
None.
Returns
d: dict

Dict object which map actions to tags

dicom_pseudonymizer.utils.simple_dicomanonymizer.keep(dataset, tag)

K - keep (unchanged for non-sequence attributes, cleaned for sequences)

dicom_pseudonymizer.utils.simple_dicomanonymizer.regexp(options: dict)

Apply a regexp method to the dataset

Parameters
optionsdict
Contains two values:
  • find: which string should be found

  • replace: string that will replace the found string

Returns
sstring

The string with the regexp applied.

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace(dataset, tag)

D - replace with a non-zero length value that may be a dummy value and consistent with the VR

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_UID(dataset, tag)

U - replace with a non-zero length UID that is internally consistent within a set of Instances Lazy solution : Replace with empty string

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_and_keep_correspondence(dataset, tag)

P - addition to pseudonimize the code and keep a lookup table. If used, it should be called when tag (0x0010, 0x0020) (PatientID) is encountered. It also replaces implicitly the tag (0x0008, 0x0050) (AccessionNumber). A lookup table (csv file) is create with columns: ‘old_patient_id’, ‘new_patient_id’, ‘old_accession_number’, ‘new_accession_number’

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element(element)

Replace element’s value according to it’s VR: - DA: cf replace_element_date - TM: replace with ‘000000.00’ - LO, SH, PN, CS: replace with ‘Anonymized’ - UI: cf replace_element_UID - IS: replace with ‘0’ - FD, FL, SS, US: replace with 0 - ST: replace with ‘’ - SQ: call replace_element for all sub elements - DT: cf replace_element_date_time

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element_UID(element)

Keep char value but replace char number with random number The replaced value is kept in a dictionary link to the initial element.value in order to automatically apply the same replaced value if we have an other UID with the same value

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element_date(element)

Replace date element’s value with ‘00010101’

dicom_pseudonymizer.utils.simple_dicomanonymizer.replace_element_date_time(element)

Replace date time element’s value with ‘00010101010101.000000+0000’

federated_learning

client

federated_learning.client.client.as_tensor(data, dtype=None, device=None) Tensor

Convert the data into a torch.Tensor. If the data is already a Tensor with the same dtype and device, no copy will be performed, otherwise a new Tensor will be returned with computational graph retained if data Tensor has requires_grad=True. Similarly, if the data is an ndarray of the corresponding dtype and the device is the cpu, no copy will be performed.

Args:
data (array_like): Initial data for the tensor. Can be a list, tuple,

NumPy ndarray, scalar, and other types.

dtype (torch.dtype, optional): the desired data type of returned tensor.

Default: if None, infers data type from data.

device (torch.device, optional): the desired device of returned tensor.

Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

Example:

>>> a = numpy.array([1, 2, 3])
>>> t = torch.as_tensor(a)
>>> t
tensor([ 1,  2,  3])
>>> t[0] = -1
>>> a
array([-1,  2,  3])

>>> a = numpy.array([1, 2, 3])
>>> t = torch.as_tensor(a, device=torch.device('cuda'))
>>> t
tensor([ 1,  2,  3])
>>> t[0] = -1
>>> a
array([1,  2,  3])
federated_learning.client.client.main(arch: str <Architecture to use> = 'resnet18', lr: int <Learning Rate> = 0.0006000000000000001, epochs: int <number of epochs for training> = 5, bs: int <Batch size to use> = 64, device: str <Which device to use> = 'cuda:0', port: int <The port used for federated learning> = 8080, apply_dp: <Learning rate> = True, alphas: range <Alphas> = range(2, 32), noise_multiplier: int <Noise injected in DP> = 0.5, max_grad_norm: int <Maximum Gradient Norm when clipping> = 1.0, delta: int <Delta> = 1e-05, matrix_path: str <Pass a value to save the  confusion matrix> = None, csv_path: str <Pass a value to store the logs in csv> = None, roc_path: str <Pass a value to store the ROC-AUC curve> = None, data_path: str <datapath to use> = '../../../../Hospitals/H0', seed: int <Pass a value to set seed> = 42)

dsail.differential_privacy

class federated_learning.client.dsail.differential_privacy.DPCallback(alphas, noise_multiplier, max_grad_norm, delta, device)

Bases: fastai.callback.core.Callback

after_epoch()
before_step()

dsail.federated_learning

class federated_learning.client.dsail.federated_learning.FLClient(learn, lr, ep, apply_dp, alphas, noise_multiplier, max_grad_norm, delta, device, csv_path, data_path, matrix_path, roc_path)

Bases: object

evaluate(parameters, config)
fit(parameters, config)
get_parameters()
set_parameters(parameters)

dsail.utils

class federated_learning.client.dsail.utils.ImbalancedDatasetSampler(dataset, indices: Optional[list] = None, num_samples: Optional[int] = None, callback_get_label: Optional[Callable] = None)

Bases: Generic[torch.utils.data.sampler.T_co]

Samples elements randomly from a given list of indices for imbalanced dataset

Parameters
indices: list

a list of indices

num_samples: int

number of samples to draw

callback_get_label: Callable

a callback-like function which takes two arguments - dataset and index

federated_learning.client.dsail.utils.get_imbalance_weights(ds)
federated_learning.client.dsail.utils.save_matrix(learn, path)
federated_learning.client.dsail.utils.save_roc(learn, path)
federated_learning.client.dsail.utils.set_seed(dls, seed)

server

class federated_learning.server.server.SaveModelStrategy(*args, **kwargs)

Bases: flwr.server.strategy.fedavg.FedAvg

aggregate_fit(rnd: int, results, failures)

Aggregate fit results using weighted average.

federated_learning.server.server.main(fraction_fit: float <The fraction of available client used for training> = 1.0, min_fit_clients: int <The minimum number of clients used to start training> = 3, min_available_clients: int <The minimum number of clients used to start server> = 3, min_eval_clients: int <The minimum number of clients used for evaluation> = 3, num_rounds: int <The number of rounds of training> = 3, save_path: str <Set a path to save weights> = None)