DIAMIN is a high-level software library to facilitate the development of applications for the efficient analysis of large-scale molecular interaction networks.
It runs over Apache Spark (>= 2.3, https://spark.apache.org/) and requires a Java-compliant virtual machine (>= 1.8). The software is released as a single executable jar file, diamin-1.0.0-all.jar, that can be used to process large molecular interaction networks in Local Mode or in Cluster Mode.
In Local Mode, DIAMIN employs multiple CPU cores of a single machine. Local Mode requires a proper installation of the Java Development Kit (JDK).
The jar file enables the usage of the DIAMIN library from the command line by specifying:
- the [input_parameters], that is the list of the parameters required by the IOmanager class to import the interaction network;
- the function_name, that is the name of one of the functions provided by the DIAMIN library;
- the [function_parameters], that is the list of the parameters required by the chosen function.
java -jar diamin-1.0.0-all.jar [input_parameters] function_name [function_parameters]
Manual.pdf lists and describes in depth all the functions provided by the DIAMIN library. The following command lines run the examples discussed in the reference paper.
In a molecular interaction network, pivotal interactors are likely to be represented by highly connected nodes (i.e., hubs). The degrees function of the DIAMIN library allows the user to extract a subset of interactors according to their degree: it computes the degree of each interactor and returns all those satisfying a given condition. Use the following syntax to compute the interactors of the HomoSapiens_intact_network associated with the 20 largest degrees:
java -jar diamin-1.0.0-all.jar LOCAL human_intact_network.txt degree 20
Use the following syntax to compute the interactors of the HomoSapiens_string_network associated with the 20 largest degrees:
java -jar diamin-1.0.0-all.jar LOCAL human_string_network.txt degree 20
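To make the idea behind the degree computation concrete, here is a minimal single-machine sketch in plain Java. It is not DIAMIN's implementation and does not use the DIAMIN API: the class name DegreeSketch, its methods, and the toy edge list are all hypothetical, and the graph is assumed to be undirected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical class, not part of the DIAMIN library.
public class DegreeSketch {

    // Count how many edges touch each node (undirected edges assumed).
    public static Map<String, Integer> degrees(List<String[]> edges) {
        Map<String, Integer> deg = new HashMap<>();
        for (String[] edge : edges) {
            deg.merge(edge[0], 1, Integer::sum);
            deg.merge(edge[1], 1, Integer::sum);
        }
        return deg;
    }

    // Return the k nodes with the largest degree; ties are broken by node name.
    public static List<String> topK(Map<String, Integer> deg, int k) {
        List<String> nodes = new ArrayList<>(deg.keySet());
        nodes.sort((a, b) -> {
            int byDegree = Integer.compare(deg.get(b), deg.get(a)); // larger degree first
            return byDegree != 0 ? byDegree : a.compareTo(b);
        });
        return new ArrayList<>(nodes.subList(0, Math.min(k, nodes.size())));
    }

    public static void main(String[] args) {
        // Toy edge list with made-up node names, not taken from the real networks.
        List<String[]> edges = Arrays.asList(
            new String[] {"A", "B"},
            new String[] {"A", "C"},
            new String[] {"A", "D"},
            new String[] {"B", "C"});
        System.out.println(topK(degrees(edges), 2)); // prints [A, B]
    }
}
```

In the toy graph, A has degree 3 and B and C have degree 2, so the two largest-degree nodes are A and B (B wins the tie by name).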
The xWeightedNeighbors function returns the x-weighted neighborhood of an input node. Use the following syntax to compute the x-weighted neighborhood of the protein TP53 (uniprotkb:P04637) with respect to the IntAct reliability scores for x = 0.75:
java -jar diamin-1.0.0-all.jar LOCAL human_intact_network.txt xWeightedNeighbors uniprotkb:P04637,0.75
Use the following syntax to compute the x-weighted neighborhood of the protein TP53 (uniprotkb:P04637) with respect to the IntAct reliability scores for x = 0.80:
java -jar diamin-1.0.0-all.jar LOCAL human_intact_network.txt xWeightedNeighbors uniprotkb:P04637,0.80
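This README does not define the x-weighted neighborhood, so the sketch below rests on an assumption: it takes the x-weighted neighborhood of a node to be the set of nodes reachable through some path whose product of edge reliability scores is at least x. Check Manual.pdf for the definition DIAMIN actually uses. The class XWeightedNeighborhoodSketch, its methods, and the toy scores are hypothetical; the code is plain Java and does not call the DIAMIN API.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: hypothetical class built on an ASSUMED definition
// (best path product of reliability scores >= x).
public class XWeightedNeighborhoodSketch {

    // Store an undirected edge with a reliability score in (0, 1].
    public static void addEdge(Map<String, Map<String, Double>> g, String a, String b, double w) {
        g.computeIfAbsent(a, k -> new HashMap<>()).put(b, w);
        g.computeIfAbsent(b, k -> new HashMap<>()).put(a, w);
    }

    // Best-first search that always expands the node with the highest path score.
    public static Set<String> neighborhood(Map<String, Map<String, Double>> g, String source, double x) {
        Map<String, Double> best = new HashMap<>();
        best.put(source, 1.0);
        PriorityQueue<Map.Entry<String, Double>> queue =
            new PriorityQueue<>((a, b) -> Double.compare(b.getValue(), a.getValue()));
        queue.add(new AbstractMap.SimpleEntry<>(source, 1.0));
        while (!queue.isEmpty()) {
            Map.Entry<String, Double> cur = queue.poll();
            if (cur.getValue() < best.get(cur.getKey())) {
                continue; // stale queue entry, a better path was already found
            }
            Map<String, Double> edges = g.getOrDefault(cur.getKey(), Collections.emptyMap());
            for (Map.Entry<String, Double> edge : edges.entrySet()) {
                double score = cur.getValue() * edge.getValue();
                // Scores never increase along a path, so paths below x can be pruned.
                if (score >= x && score > best.getOrDefault(edge.getKey(), 0.0)) {
                    best.put(edge.getKey(), score);
                    queue.add(new AbstractMap.SimpleEntry<>(edge.getKey(), score));
                }
            }
        }
        Set<String> result = new TreeSet<>(best.keySet());
        result.remove(source);
        return result;
    }

    public static void main(String[] args) {
        // Toy network with made-up nodes and scores.
        Map<String, Map<String, Double>> g = new HashMap<>();
        addEdge(g, "P", "A", 0.9);
        addEdge(g, "A", "B", 0.9);
        addEdge(g, "B", "D", 0.9);
        addEdge(g, "P", "C", 0.5);
        System.out.println(neighborhood(g, "P", 0.75)); // prints [A, B]
    }
}
```

Under this assumed definition, raising x can only shrink the result, which is consistent with running the same query at 0.75 and 0.80 to compare neighborhoods of different strictness.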
Cluster Mode makes it possible to exploit the resources of a computer cluster. Assuming both Apache Spark and Java are properly installed, the following syntax performs the degrees computation discussed in Example 2 on a computer cluster:
spark-submit diamin-1.0.0-all.jar CLUSTER human_intact_network.txt degree 20
Nowadays, many cloud providers offer quick and easy creation of a Spark cluster.
Moreover, we refer the interested reader to the following link for a quick guide about the installation of Apache Spark on a free EC2 AWS instance: https://dzone.com/articles/apache-spark-setting-up-a-cluster-on-aws.
DIAMIN also allows user-driven analysis of molecular interaction networks (MINs). In this case, users combine the provided functions in a Java IDE to implement new algorithms that solve both standard and more specific problems in network analysis.
IntelliJ IDEA is one of the most widely used integrated development environments (IDEs) for developing software written in Java. In the following, we provide a step-by-step tutorial for the computation of the Kleinberg dispersion measure by combining DIAMIN functions in IntelliJ IDEA:
- Install IntelliJ IDEA.
- Above the list of files of the DIAMIN repository, click on the Code button and download the source code.
- Import the DIAMIN directory into IntelliJ IDEA as an sbt project.
- Combine DIAMIN classes and functions. Write your algorithm in the Main class, starting from line x.
- Run your algorithm in local mode.
spark-submit diamin-1.0.0-all.jar LOCAL human_intact_network.txt my_algorithm 20
- Run your algorithm in cluster mode.
spark-submit diamin-1.0.0-all.jar CLUSTER human_intact_network.txt my_algorithm 20
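For readers unfamiliar with the measure named in the tutorial, the following is a minimal single-machine sketch of the (unnormalized) dispersion of Backstrom and Kleinberg. For an edge u-v, it counts the pairs of common neighbors s, t that are not directly linked and share no neighbor inside u's neighborhood other than u and v. The sketch uses plain Java collections rather than DIAMIN functions, so it is not the code you would write in the Main class; the class DispersionSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical class, not part of the DIAMIN library.
public class DispersionSketch {

    // Store an undirected edge in the adjacency map.
    public static void addEdge(Map<String, Set<String>> adj, String a, String b) {
        adj.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adj.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Unnormalized dispersion of the pair (u, v).
    public static int dispersion(Map<String, Set<String>> adj, String u, String v) {
        Set<String> uNbrs = adj.getOrDefault(u, Collections.emptySet());
        Set<String> vNbrs = adj.getOrDefault(v, Collections.emptySet());
        List<String> common = new ArrayList<>();
        for (String n : uNbrs) {
            if (vNbrs.contains(n)) {
                common.add(n);
            }
        }
        int count = 0;
        for (int i = 0; i < common.size(); i++) {
            for (int j = i + 1; j < common.size(); j++) {
                String s = common.get(i);
                String t = common.get(j);
                if (adj.get(s).contains(t)) {
                    continue; // s and t are directly linked
                }
                if (!shareNeighbor(adj, uNbrs, s, t, u, v)) {
                    count++; // s and t are "far apart" within u's neighborhood
                }
            }
        }
        return count;
    }

    // True if s and t have a common neighbor, other than u and v, among u's neighbors.
    private static boolean shareNeighbor(Map<String, Set<String>> adj, Set<String> uNbrs,
                                         String s, String t, String u, String v) {
        for (String w : adj.get(s)) {
            if (w.equals(u) || w.equals(v)) {
                continue;
            }
            if (uNbrs.contains(w) && adj.get(t).contains(w)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Toy graph with made-up node names: a, b, c are common neighbors of u and v.
        Map<String, Set<String>> adj = new HashMap<>();
        addEdge(adj, "u", "v");
        for (String n : new String[] {"a", "b", "c"}) {
            addEdge(adj, "u", n);
            addEdge(adj, "v", n);
        }
        addEdge(adj, "a", "b");
        System.out.println(dispersion(adj, "u", "v")); // prints 2
    }
}
```

In the toy graph the pair (a, b) is directly linked and contributes nothing, while (a, c) and (b, c) are unlinked and share no other neighbor, giving a dispersion of 2.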
The DIAMIN library was tested by using the following protein-to-protein network: