UNIFIND

unipv.it

Random sampling of the Protein Data Bank: RaSPDB

Article
Publication date:
2021
Abstract:
A novel and simple procedure (RaSPDB) for Protein Data Bank mining is described. Ten PDB subsets, each containing 7000 randomly selected protein chains, are built and used to make 10 estimations of the average value of a generic feature F (the length of the protein chain, the amino acid composition, the crystallographic resolution, and the secondary structure composition). These 10 estimations are then used to compute an average estimation of F together with its standard error. It is heuristically verified that the size of these 10 subsets (7000 protein chains) is sufficiently small to avoid redundancy within each subset and sufficiently large to guarantee stable estimations among different subsets. RaSPDB has two major advantages over classical procedures that aim to build a single, non-redundant PDB subset: a larger fraction of the information stored in the PDB is used, and an estimation of the standard error of F is possible.
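The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the author's code: the function name `raspdb_estimate`, the choice to draw each subset without replacement from the full pool, and the standard-error formula (sample standard deviation of the 10 subset means divided by the square root of 10) are all assumptions made for illustration, and the chain lengths used in the demo are synthetic.

```python
import random
import statistics


def raspdb_estimate(values, n_subsets=10, subset_size=7000, seed=0):
    """Estimate the mean of a feature F and its standard error.

    `values` holds one value of F per protein chain (hypothetical input).
    Assumption: each subset is sampled without replacement, independently
    of the other subsets, so a chain may appear in more than one subset.
    """
    rng = random.Random(seed)
    subset_means = []
    for _ in range(n_subsets):
        subset = rng.sample(values, subset_size)
        subset_means.append(statistics.fmean(subset))

    mean_f = statistics.fmean(subset_means)
    # Assumed definition: standard error of the mean across the subsets.
    std_err = statistics.stdev(subset_means) / n_subsets ** 0.5
    return mean_f, std_err, subset_means


# Demo on synthetic chain lengths (not real PDB data).
demo_rng = random.Random(1)
chain_lengths = [demo_rng.randint(50, 1000) for _ in range(50_000)]
mean_len, se_len, per_subset = raspdb_estimate(chain_lengths)
print(f"mean length = {mean_len:.1f} +/- {se_len:.1f}")
```

With real data, `values` would be replaced by the feature of interest extracted from the PDB chains; the spread of `per_subset` gives a quick check of whether the estimates are stable across subsets.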
CRIS type:
1.1 Journal article
Keywords:
Biochemistry, Computational biology and bioinformatics, Molecular biology, Structural biology
Author list:
Carugo, Oliviero
University authors:
CARUGO OLIVIERO ITALO
Link to full record:
https://iris.unipv.it/handle/11571/1468538
Published in:
SCIENTIFIC REPORTS
Journal
General data

URL

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8683422/