YC.S edited this page Sep 26, 2019 · 32 revisions

(1) spipy.image.radp

+ :numpy.1darray = radial_profile_2d (data:numpy.2darray, center:(float, float), mask:numpy.2darray/None)

+ :numpy.1darray = radial_profile_3d (data:numpy.3darray, center:(float, float, float), mask:numpy.3darray/None)

+ :list = shells_2d (rads:[float, ...], data_shape:(int, int), center:(float, float))

+ :list = shells_3d (rads:[float, ...], data_shape:(int, int, int), center:(float, float, float))

+ :numpy.2darray = radp_norm_2d (ref_Iq:numpy.1darray, data:numpy.2darray, center:(float, float), mask:numpy.2darray/None)

+ :numpy.3darray = radp_norm_3d (ref_Iq:numpy.1darray, data:numpy.3darray, center:(float, float, float), mask:numpy.3darray/None)

+ :numpy.2darray = circle (data_shape:int, rad:float)

--

  • radial_profile_2d : return the radial intensity profile of a 2d pattern, shape=(Nr,3), where Nr is the distance in pixels from the center to the farthest pattern corner. The three columns are r (pixels), Iq and std-error.
    • data : input pattern, shape=(Nx,Ny)
    • center : zero frequency point of the pattern, unit=pixels
    • mask : 0/1 binary pattern, shape=(Nx, Ny), 1 means masked area, 0 means useful area, default=None
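The radial averaging described above can be sketched in plain numpy. This is an illustrative re-implementation, not spipy's actual code; the name `radial_profile_2d_sketch` is invented here:

```python
import numpy as np

def radial_profile_2d_sketch(data, center, mask=None):
    """Mean and std-error of intensity in 1-pixel-wide rings around center."""
    x, y = np.indices(data.shape)
    r = np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2).astype(int)
    valid = np.ones(data.shape, bool) if mask is None else (mask == 0)
    nr = r[valid].max() + 1
    counts = np.bincount(r[valid], minlength=nr)
    sums = np.bincount(r[valid], weights=data[valid], minlength=nr)
    mean = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    # std-error of the mean in each ring: std / sqrt(n)
    sq = np.bincount(r[valid], weights=data[valid] ** 2, minlength=nr)
    var = np.where(counts > 0, sq / np.maximum(counts, 1) - mean ** 2, 0.0)
    stderr = np.sqrt(np.maximum(var, 0.0)) / np.sqrt(np.maximum(counts, 1))
    return np.stack([np.arange(nr), mean, stderr], axis=1)  # columns: r, Iq, std-error
```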

--

  • radial_profile_3d : return the radial intensity profile of a 3d field, shape=(Nr,3), where Nr is the distance in pixels from the center to the farthest volume corner. The three columns are r (pixels), Iq and std-error.
    • data : input pattern, shape=(Nx,Ny,Nz)
    • center : zero frequency point of the scattering volume, unit=pixels
    • mask : 0/1 binary pattern, shape=(Nx, Ny, Nz), 1 means masked area, 0 means useful area, default=None

--

  • shells_2d : return the xy indices of a pattern that form a shell/circle at each radius in rads; the return value is a list [ shell1, shell2, ..., shelln ], where shelli is a numpy array of (x, y) indices with shape=(Ni,2)
    • rads : radii of the output shells, list, [ r1, r2, ..., rn ], unit=pixels
    • data_shape : size of your pattern, (size_x, size_y), unit=pixels
    • center : zero frequency point of the pattern, unit=pixels

--

  • shells_3d : return the xyz indices of a field that form a spherical shell/surface at each radius in rads; the return value is a list [ shell1, shell2, ..., shelln ], where shelli is a numpy array of (x, y, z) indices with shape=(Ni,3)
    • rads : radii of the output shells, list, [ r1, r2, ..., rn ], unit=pixels
    • data_shape : size of your volume, (size_x, size_y, size_z), unit=pixels
    • center : zero frequency point of the scattering volume, unit=pixels
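A minimal numpy sketch of what the shell functions compute (illustrative only; `shells_2d_sketch` is an invented name, and the exact ring-membership rule in spipy may differ from the rounding used here):

```python
import numpy as np

def shells_2d_sketch(rads, data_shape, center):
    """For each radius, return the (x, y) indices whose rounded distance
    to `center` equals that radius."""
    x, y = np.indices(data_shape)
    r = np.round(np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2)).astype(int)
    return [np.argwhere(r == rad) for rad in rads]
```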

--

  • radp_norm_2d : normalize pattern intensities against a reference radial intensity profile, return the normalized pattern
    • ref_Iq : reference intensity radial profile, shape=(Nr,)
    • data : pattern to be normalized, shape=(Nx,Ny)
    • center : zero frequency point of the pattern, unit=pixels
    • mask : 0/1 binary pattern, shape=(Nx, Ny), 1 means masked area, 0 means useful area, default=None
    • [Notice] The program does not require the shapes of 'ref_Iq' and 'data' to match exactly, but the first point of 'ref_Iq' must correspond to the center of 'data'
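One plausible normalization scheme is to rescale each ring of the pattern so its mean radial intensity matches the reference profile. The sketch below implements that assumed scheme (it is not spipy's code; `radp_norm_2d_sketch` is an invented name):

```python
import numpy as np

def radp_norm_2d_sketch(ref_Iq, data, center, mask=None):
    """Scale each 1-pixel ring of `data` so its mean matches ref_Iq."""
    x, y = np.indices(data.shape)
    r = np.round(np.hypot(x - center[0], y - center[1])).astype(int)
    valid = np.ones(data.shape, bool) if mask is None else (mask == 0)
    nr = r.max() + 1
    counts = np.bincount(r[valid], minlength=nr)
    sums = np.bincount(r[valid], weights=data[valid], minlength=nr)
    cur_Iq = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    ref = np.zeros(nr)
    n = min(nr, len(ref_Iq))
    ref[:n] = ref_Iq[:n]            # ref_Iq need not cover every radius
    scale = np.where(cur_Iq > 0, ref / np.maximum(cur_Iq, 1e-12), 1.0)
    return data * scale[r]
```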

--

  • radp_norm_3d : normalize volume intensities against a reference radial intensity profile, return the normalized volume
    • ref_Iq : reference intensity radial profile, shape=(Nr,)
    • data : volume to be normalized, shape=(Nx,Ny,Nz)
    • center : zero frequency point of the scattering volume, unit=pixels
    • mask : 0/1 binary pattern, shape=(Nx, Ny, Nz), 1 means masked area, 0 means useful area, default=None
    • [Notice] The program does not require the sizes of 'ref_Iq' and 'data' to match exactly, but the first point of 'ref_Iq' must correspond to the center of 'data'

--

  • circle : generate a solid circle/sphere area with a given radius, centered at the origin of coordinates. Return the indices of all points inside this area, shape=(Np,data_shape)
    • data_shape : dimension of the output, int, 2 (circle) or 3 (sphere)
    • rad : float, radius of the generated area, in pixels
    • [NOTICE] : The diameter of the output circle is 2*rad+1; set rad<0 to get None as output
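A numpy sketch of this behavior (illustrative only, not spipy's code; `circle_sketch` is an invented name):

```python
import numpy as np

def circle_sketch(data_shape, rad):
    """Indices of all points within `rad` of the origin, on a grid
    of diameter 2*rad+1; returns None if rad < 0."""
    if rad < 0:
        return None
    r = int(rad)
    grids = np.indices((2 * r + 1,) * data_shape) - r   # coordinates centered on origin
    dist = np.sqrt((grids ** 2).sum(axis=0))
    return np.argwhere(dist <= rad) - r                 # shape=(Np, data_shape)
```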

(2) spipy.image.quat

  • In this module, quaternions are formatted as numpy.array( [w, qx, qy, qz] ), where w = cos(theta/2) and (qx, qy, qz) = (x, y, z) * sin(theta/2) for a rotation by angle theta about the unit axis (x, y, z).

+ :numpy.1darray = invq (q:numpy.1darray)

+ :numpy.1darray = quat_mul (q1:numpy.1darray, q2:numpy.1darray)

+ :numpy.1darray = conj (q:numpy.1darray)

+ :numpy.2darray = quat2rot (q:numpy.1darray)

+ :numpy.1darray = rot2quat (rot:numpy.2darray)

+ :numpy.1darray = quat2azi (q:numpy.1darray)

+ :numpy.1darray = azi2quat (azi:numpy.1darray)

+ :numpy.1darray = rotv (vector:np.1darray, q:np.1darray)

+ :numpy.1darray = Slerp (q1:numpy.1darray, q2:numpy.1darray, t:float)

--

  • invq : calculate the inverse/reciprocal of a quaternion, return q^-1
    • q : input quaternion (w, qx, qy, qz)

  • quat_mul : multiply two quaternions, return q1 * q2

  • conj : conjugate quaternion, return q*

--

  • quat2rot : convert a quaternion to a 3D rotation matrix, return a 3x3 matrix

  • rot2quat : convert a 3D rotation matrix to a quaternion, return the quaternion

--

  • quat2azi : convert a quaternion to an angle-axis representation, return numpy.array([theta,x,y,z])

  • azi2quat : convert an angle-axis representation to a quaternion, return numpy.array([w,qx,qy,qz])

--

  • rotv : rotate a 3D vector using a quaternion, return the new vector numpy.array( [x', y', z'] )
    • vector : input vector to rotate, numpy.array( [x, y, z] )
    • q : quaternion
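Rotating a vector by a unit quaternion in the [w, qx, qy, qz] convention computes q * [0, v] * conj(q). The sketch below shows the standard Hamilton-product formulation of what rotv computes (illustrative, not spipy's code; the `_sketch` names are invented):

```python
import numpy as np

def quat_mul_sketch(q1, q2):
    """Hamilton product of two quaternions in [w, qx, qy, qz] order."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rotv_sketch(vector, q):
    """Rotate a 3D vector by unit quaternion q via q * [0, v] * conj(q)."""
    conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    v = np.concatenate([[0.0], vector])
    return quat_mul_sketch(quat_mul_sketch(q, v), conj)[1:]
```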

--

  • Slerp : spherical linear interpolation between two quaternions, return the interpolated quaternion
    • q1 & q2 : two input quaternions
    • t : interpolation weight from q1 to q2, 0~1
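The textbook slerp formula illustrates what this function computes (a sketch, not spipy's code; `slerp_sketch` is an invented name):

```python
import numpy as np

def slerp_sketch(q1, q2, t):
    """Spherical linear interpolation between unit quaternions, t in [0, 1]."""
    q1 = q1 / np.linalg.norm(q1)
    q2 = q2 / np.linalg.norm(q2)
    dot = np.dot(q1, q2)
    if dot < 0:            # take the shorter arc
        q2, dot = -q2, -dot
    if dot > 0.9995:       # nearly parallel: fall back to lerp
        q = q1 + t * (q2 - q1)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1 - t) * theta) * q1 + np.sin(t * theta) * q2) / np.sin(theta)
```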

(3) spipy.image.classify

+ :[numpy.2darray, numpy.1darray] = cluster_fSpec (dataset:numpy.3darray, mask:numpy.2darray/None, low_filter:float/0.3, decomposition:str/'SVD', ncomponent:int/2, nneighbors:int/10, LLEmethod:str/'Standard', clustering:int/2, njobs:int/1, verbose:bool/True)

+ :[numpy.2darray, numpy.1darray] = cluster_fTSNE (dataset:numpy.3darray, mask:numpy.2darray/None, low_filter:float/0.3, no_dims:int/2, perplexity:int/50, use_pca:bool/True, initial_dims:int/50, max_iter:int/500, theta:float/0.5, randseed:int/-1, clustering:int/2, njobs:int/1, verbose:bool/False)

--

  • cluster_fSpec : single/non-single hit clustering using linear or non-linear decomposition and spectral clustering, return [ data_decomp{ numpy.array, shape=(Nd, ncomponent) }, label{ numpy.array, shape=(Nd, ) }], where 'data_decomp' is the data after decomposition and 'label' holds the predicted cluster labels.
    • dataset : raw dataset, int/float, shape=(Nd, Nx, Ny)
    • mask : mask file, a binary (0/1) 2d numpy array where 1 means masked area, shape=(Nx, Ny)
    • low_filter : float 0~1, the fraction of the central area of the FFT intensity pattern used for clustering, default=0.3
    • decomposition : decomposition method, chosen from 'LLE' (Locally Linear Embedding), 'SVD' (Truncated Singular Value Decomposition) and 'SpecEM' (Spectral Embedding), default='SVD'
    • ncomponent : number of components left after decomposition, default=2
    • nneighbors : number of neighbors to consider for each point, considered only when decomposition method is 'LLE', default=10
    • LLEmethod : LLE method, chosen from 'standard' (standard locally linear embedding algorithm), 'modified' (modified locally linear embedding algorithm), 'hessian' (Hessian eigenmap method) and 'ltsa' (local tangent space alignment algorithm), default='standard'
    • clustering : int, whether to do clustering (<0 or >0) and how many classes (value of this param) to have, default=2
    • njobs : int, number of jobs, default=1
    • verbose: bool, default=True
    • [NOTICE] The input dataset should contain no more than about 1k patterns, but at least 50. You can split a larger dataset into several parts and process them with multiple processes.
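Both clustering routines build their features from the central part of each pattern's FFT intensity, controlled by low_filter. The sketch below shows one plausible version of that feature step, assuming a centered crop of the shifted FFT magnitude (an assumption about the preprocessing, not spipy's code; the function name is invented):

```python
import numpy as np

def fft_lowfreq_features_sketch(dataset, low_filter=0.3):
    """Keep the central `low_filter` fraction of each pattern's shifted
    FFT magnitude and flatten it into a feature vector."""
    nd, nx, ny = dataset.shape
    hx, hy = int(nx * low_filter / 2), int(ny * low_filter / 2)
    cx, cy = nx // 2, ny // 2
    feats = []
    for pat in dataset:
        f = np.abs(np.fft.fftshift(np.fft.fft2(pat)))
        feats.append(f[cx - hx:cx + hx, cy - hy:cy + hy].ravel())
    return np.array(feats)
```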

--

  • cluster_fTSNE : single/non-single hit clustering using t-SNE decomposition and KNN clustering, return [ data_decomp{ numpy.array, shape=(Nd, ncomponent) }, label{ numpy.array, shape=(Nd, ) }], where 'data_decomp' is the data after decomposition and 'label' holds the predicted cluster labels.
    • dataset : raw dataset, int/float, shape=(Nd, Nx, Ny)
    • mask : mask file, a binary (0/1) 2d numpy array where 1 means masked area, shape=(Nx, Ny)
    • low_filter : float 0~1, the percent of area at the center part of fft intensity pattern that is used for clustering, default=0.3
    • no_dims : number of components left after decomposition, default=2
    • perplexity : perplexity value used to evaluate P(i|j) in t-SNE, POSITIVE, usually positively correlated with the dataset size; values above 50 are rarely needed, default=50
    • use_pca : whether to use PCA to generate initial features, default=True
    • initial_dims : output dimensions of the initial PCA, POSITIVE, ignored if use_pca=False, default=50
    • max_iter : maximum number of iterations, default=500, values >500 are suggested
    • theta : the speed vs accuracy trade-off parameter, theta=1 means highest speed with lowest accuracy, default=0.5
    • randseed : if >=0, use it as the seed for generating initial values; if <0, use the current time as the random seed, default=-1
    • clustering : int, whether to do clustering (<0 or >0) and how many classes (value of this param) to have, default=2
    • njobs : int, number of threads in parallel, default=1
    • verbose : whether to print details while running, default=False
    • [NOTICE] The input dataset should contain no more than about 5k patterns, but at least 500. You can split a larger dataset into several parts and process them with multiple processes.

(4) spipy.image.preprocess

+ :numpy.3darray = fix_artifact (dataset:numpy.3darray, estimated_center:(int, int), artifacts:numpy.2darray, mask:numpy.2darray/None)

+ :numpy.3darray = fix_artifact_auto (dataset:numpy.3darray, estimated_center:(int, int), njobs:int/1, mask:numpy.2darray/None, vol_of_bins:int/100)

+ :float or [float, numpy.3darray] = adu2photon (dataset:numpy.3darray, mask:numpy.2darray/None, photon_percent:float/0.1, nproc:int/2, transfer:bool/True, force_poisson:bool/False)

+ :numpy.1darray = hit_find (dataset:numpy.3darray, background:numpy.2darray, radii_range:list, mask:numpy.2darray/None, cut_off:float/None)

+ :numpy.1darray = hit_find_pearson (dataset:numpy.3darray, background:numpy.2darray, radii_range:list, mask:numpy.2darray/None, max_cc:float/0.5)

--

  • fix_artifact : reduce artifacts in a dataset (adu values); all patterns in the dataset should share the same artifacts. To save RAM, the input dataset is modified in place and returned
    • dataset : FLOAT adu patterns, shape=(Nd,Nx,Ny)
    • estimated_center : estimated pattern center in pixels, (Cx,Cy)
    • artifacts : locations of artifacts in the pattern (indices), shape=(Na,2); the first column holds x coordinates and the second column holds y coordinates
    • mask : mask area of patterns, a 0/1 binary array where 1 means masked, shape=(Nx,Ny), default=None
    • [NOTICE] This function cannot reduce background noise, try preprocess.adu2photon instead

--

  • fix_artifact_auto : fix artifacts without providing their positions; patterns with similar intensities are grouped together and the averaged radial profile of each group is used to analyze artifacts at every pixel. To save RAM, the input dataset is modified in place and returned
    • dataset : FLOAT adu patterns, shape=(Nd,Nx,Ny)
    • estimated_center : estimated pattern center in pixels, (Cx,Cy)
    • njobs : number of processes to run in parallel, default=1
    • mask : mask area of patterns, a 0/1 binary array where 1 means masked, shape=(Nx,Ny), default=None
    • vol_of_bins : the number of similar patterns that will be processed together in a group, default=100
    • [NOTICE] vol_of_bins is suggested to be 100~200 and the whole dataset should contain >1k patterns. This method requires single-particle patterns to be classified in advance

--

  • adu2photon : estimate the adu value per photon and optionally convert adu patterns to photon-count patterns; return adu : float if transfer=False, or [adu : float, data_photonCount : numpy.ndarray, shape=(Nd,Nx,Ny)] if transfer=True
    • dataset : patterns of adu values, shape=(Nd, Nx, Ny)
    • mask : a 0/1 2d array (1 means masked pixel), shape=(Nx,Ny), default=None
    • photon_percent : estimated fraction of pixels that have photons, default=0.1
    • nproc : number of processes running in parallel, default=2
    • transfer : True -> estimate the adu unit and convert to photons, False -> just estimate, default=True
    • force_poisson : whether to determine the photon number at each pixel according to a Poisson distribution, default=False, ignored if transfer=False
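Once the adu unit is known, the conversion step itself amounts to dividing and rounding; the estimation of the adu unit is the real work and is not reproduced here. A sketch of the conversion only, where `adu_per_photon` is a hypothetical, already-estimated input (not spipy's code):

```python
import numpy as np

def adu2photon_sketch(dataset, adu_per_photon):
    """Round each pixel's adu value to a photon count."""
    photons = np.rint(dataset / adu_per_photon).astype(int)
    return np.maximum(photons, 0)   # negative adu (dark noise) clips to 0
```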

--

  • hit_find : hit finding based on a chi-square score; a high score indicates a hit. Return label : 0/1 numpy.array, in the same order as the input dataset, where 1 marks a hit.
    • dataset : raw input patterns, numpy.ndarray, shape=(Nd,Nx,Ny)
    • background : dark pattern for calibration, numpy.ndarray, shape=(Nx,Ny)
    • radii_range : annular area used for hit-finding, list/array, [center_x, center_y, inner_r, outer_r], unit=pixels, default=None. (You may give center_x,center_y=None and the pattern center will be searched for automatically)
    • mask : mask area of patterns, 0/1 numpy.ndarray where 1 means masked, shape=(Nx,Ny), default=None
    • cut_off : chi-square cut-off, positive int/float, default=None, in which case a Gaussian-mixture analysis is used for clustering. Usually a value >= 10 works well.
    • [NOTICE] : if cut_off=None is used, the input dataset should contain over 100 patterns
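A sketch of an assumed form of the chi-square score inside the annulus: a large deviation of the pattern from the dark background scores high (illustrative only, not spipy's code; the function name is invented):

```python
import numpy as np

def chi_square_score_sketch(pattern, background, annulus):
    """Mean chi-square deviation from the background inside an annulus
    given as (center_x, center_y, inner_r, outer_r)."""
    cx, cy, r_in, r_out = annulus
    x, y = np.indices(pattern.shape)
    r = np.hypot(x - cx, y - cy)
    sel = (r >= r_in) & (r <= r_out)
    diff = pattern[sel] - background[sel]
    return np.sum(diff ** 2 / np.maximum(background[sel], 1.0)) / sel.sum()
```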

--

  • hit_find_pearson : hit finding based on the Pearson correlation coefficient between pattern and background; a low score indicates a hit. Return label : 0/1 numpy.array, in the same order as the input dataset, where 1 marks a hit
    • dataset : raw input patterns, numpy.ndarray, shape=(Nd,Nx,Ny)
    • background : dark pattern for calibration, numpy.ndarray, shape=(Nx,Ny)
    • radii_range : annular area used for hit-finding, list/array, [center_x, center_y, inner_r, outer_r], unit=pixels, default=None. (You may give center_x,center_y=None and the pattern center will be searched for automatically)
    • mask : mask area of patterns, 0/1 numpy.ndarray where 1 means masked, shape=(Nx,Ny), default=None
    • max_cc : maximum cc for a pattern to be identified as a hit, float in -1~1; lower means more strict, default=0.5
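The score itself is a Pearson correlation restricted to the annulus; a blank frame tracks the background closely (cc near 1), while a hit decorrelates from it. A sketch of that computation (illustrative, not spipy's code):

```python
import numpy as np

def pearson_cc_sketch(pattern, background, annulus, mask=None):
    """Pearson cc between pattern and background inside the annular
    region (center_x, center_y, inner_r, outer_r)."""
    cx, cy, r_in, r_out = annulus
    x, y = np.indices(pattern.shape)
    r = np.hypot(x - cx, y - cy)
    sel = (r >= r_in) & (r <= r_out)
    if mask is not None:
        sel &= (mask == 0)
    return np.corrcoef(pattern[sel], background[sel])[0, 1]
```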

(5) spipy.image.io

+ :{'volume':numpy.3darray, 'header':object} = readccp4 (file_path:str)

+ :void = writeccp4 (volume:numpy.3darray, save_file:str)

+ :numpy.3darray = pdb2density (pdb_file:str, resolution:float)

+ :void = cxi_parser (cxifile:str, out:str)

+ :void = xyz2pdb (xyz_array:numpy.2darray, atom_type:array or list, b_factor:array or list/None, save_file:str/"convert.pdb")

- :numpy.2darray = _readpdb (pdb_file:str)

+ :list = readpdb_full (pdb_file:str)

--

  • readccp4 : read ccp4/mrc files and return their information; return a dict, {'volume':intensity matrix, 'header':header information}
    • file_path : path of ccp4/mrc file to read

--

  • writeccp4 : write given intensity volume into a ccp4/mrc file, NO RETURN
    • volume : the 3D intensity volume, a numpy array
    • save_file : specify the path+name of ccp4/mrc file that you want to save to, for example save_file='./test.ccp4'

--

  • pdb2density : read a pdb file and convert it to an electron density map; return the electron density map, a numpy 3D-array
    • pdb_file : path of pdb file to read
    • resolution : the resolution of density map, in angstrom

--

  • cxi_parser : print the internal path structure of a cxi/h5 file, no return
    • cxifile : str, cxi file path
    • out : str, give 'std' for terminal print or give a file path to redirect to that file

--

  • xyz2pdb : write 3D xyz-coordinates to a pdb file, no return
    • xyz_array : numpy.2darray, shape=(Np,3), columns from the 1st to 3rd is x,y,z coordinates
    • atom_type : list, the atom types to write into the file. If the list has only one item, all atoms share that type; otherwise it must contain one atom type per row of xyz_array. For example, give either ['C'] or ['C','H','H','O','H'] for a 5-atom pdb file. Upper and lower case are both accepted.
    • b_factor : array or list, b-factors in pdb file, shape=(Np,), default is None then all b-factors=1
    • save_file : str, file path to save, default="./convert.pdb"

--

  • _readpdb : read pdb files, return a numpy.2darray with shape (Np,5). Columns from 1st to 5th are [atom-number, x, y, z, atom-mass]
    • pdb_file : str, the path of pdb file to be read

--

  • readpdb_full : read pdb files and return full information, return a dict {res-index:[res-name,{atom-name:[atom-index,atom-mass,x,y,z,occupancy,b-factor]}]}
    • pdb_file : str, the path of pdb file to be read
