Using regular expressions to match a TTL document processed by rdf3x
Published: 2019-06-27


1. Take, for example, the following fragment of a TTL document processed by rdf3x:

Note that the indentation is two spaces.

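<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "30S ribosomal protein S1".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659>,
	<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "16S/23S ribosomal RNA interface".
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "23S ribosomal RNA".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.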

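2. The regular expression code, written in Java

The active pattern and the two commented-out patterns below it produce the three different results shown afterwards; enable one of them at a time.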

package com.jena;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class rdfReader3 {

    public static void main(String[] args) {
        // Enable exactly one of the three patterns. It is compiled once, outside the read loop.
        // (1) Everything inside any pair of angle brackets.
        Pattern p = Pattern.compile("<([^<>]*)>");
        // (2) Every subject: a <...> pair at the very start of the line.
        //     readLine() strips line terminators, so \n* matches nothing here and ^ does the work.
        // Pattern p = Pattern.compile("^\n*<([^<>]*)>");
        // (3) A <...> pair preceded by two spaces; the content may not contain angle brackets.
        // Pattern p = Pattern.compile("  <([^<>]*)>");

        String path = "C:/Users/Don/workspace/Jena/src/com/jena/bindingsite";
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            String s;
            while ((s = br.readLine()) != null) {
                Matcher m = p.matcher(s);
                while (m.find()) {
                    System.out.println(m.group(1));   // the text between < and >
                }
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
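Instead of commenting lines in and out, the three patterns could be kept side by side and selected by name. The following is only a minimal sketch, not the original author's code: the class `TtlPatterns`, the enum `Kind`, the method `extract`, and the `example.org` sample lines are all made up for illustration, and the subject pattern is written as `^<([^<>]*)>`, assuming each input is a single line without a terminator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: the three patterns from the listing, selectable by name. */
final class TtlPatterns {

    enum Kind {
        ALL_IRIS("<([^<>]*)>"),      // every <...> pair
        SUBJECTS("^<([^<>]*)>"),     // a <...> pair at the start of the line
        PREDICATES("  <([^<>]*)>");  // a <...> pair preceded by two spaces

        final Pattern pattern;

        Kind(String regex) {
            this.pattern = Pattern.compile(regex);
        }
    }

    /** Returns capture group 1 of every match of the chosen pattern in one line. */
    static List<String> extract(Kind kind, String line) {
        List<String> found = new ArrayList<>();
        Matcher m = kind.pattern.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the same shape as the fragment above.
        String subjectLine = "<http://example.org/s> <http://example.org/p> <http://example.org/o>;";
        String predicateLine = "  <http://example.org/q> \"literal\".";

        for (Kind kind : Kind.values()) {
            System.out.println(kind + " on subject line:   " + extract(kind, subjectLine));
            System.out.println(kind + " on predicate line: " + extract(kind, predicateLine));
        }
    }
}
```

Compiling each pattern once in the enum constructor also avoids recompiling it for every line read.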

(1) Match the content of every pair of angle brackets

Output

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624

(2) Match every subject, i.e. the content of the first pair of angle brackets on each line that does not begin with two spaces

 

Output

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
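The subject pattern relies on `^`, which by default matches only at the start of the input, so it works here because the file is fed to the matcher one line at a time. If the whole document were held in a single string instead, `Pattern.MULTILINE` would make `^` match at the start of every line. A minimal sketch under that assumption follows; the class `SubjectScan`, the method `subjects`, and the `example.org` IRIs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical variant: find subjects in a whole TTL string rather than line by line. */
final class SubjectScan {

    // MULTILINE makes ^ match after each line terminator, not only at the start of the input.
    static final Pattern SUBJECT = Pattern.compile("^<([^<>]*)>", Pattern.MULTILINE);

    /** Returns the IRI that opens each unindented line, in document order. */
    static List<String> subjects(String ttl) {
        List<String> found = new ArrayList<>();
        Matcher m = SUBJECT.matcher(ttl);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up three-line sample: two subject lines and one indented predicate line.
        String ttl = String.join("\n",
                "<http://example.org/a> <http://example.org/p> <http://example.org/b>.",
                "<http://example.org/b> <http://example.org/type> <http://example.org/T>;",
                "  <http://example.org/label> \"b\".");
        for (String subject : subjects(ttl)) {
            System.out.println(subject);
        }
    }
}
```

The indented third line is skipped because `<` is not its first character.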

 

 

(3) Match two leading spaces followed by the content of a pair of angle brackets, where the content itself contains no angle brackets

 

http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName

 

To match data that begins with two spaces, simply type two literal spaces at the start of the pattern:

Pattern p= Pattern.compile("  <([^<>]*)>");
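One thing to keep in mind: because `find()` searches anywhere in the line, the pattern above matches wherever two spaces happen to precede `<`, including lines indented by four or more spaces. If only lines indented by exactly two spaces should count, the pattern can be anchored. The comparison below is a small sketch; the class `PredicateMatch`, the names `LOOSE` and `ANCHORED`, and the `example.org` lines are assumptions for illustration and do not appear in the original code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical comparison of the unanchored and anchored two-space patterns. */
final class PredicateMatch {

    // The pattern from the article: two spaces anywhere in the line, then <...>.
    static final Pattern LOOSE = Pattern.compile("  <([^<>]*)>");
    // Anchored variant: the line must start with exactly two spaces, then <...>.
    static final Pattern ANCHORED = Pattern.compile("^ {2}<([^<>]*)>");

    /** Returns capture group 1 of the first match in the line, if there is one. */
    static Optional<String> first(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String twoSpaces = "  <http://example.org/p> \"x\";";
        String fourSpaces = "    <http://example.org/o>.";

        System.out.println("loose,    2 spaces: " + first(LOOSE, twoSpaces).isPresent());
        System.out.println("loose,    4 spaces: " + first(LOOSE, fourSpaces).isPresent());
        System.out.println("anchored, 2 spaces: " + first(ANCHORED, twoSpaces).isPresent());
        System.out.println("anchored, 4 spaces: " + first(ANCHORED, fourSpaces).isPresent());
    }
}
```

Only the last case reports no match: the anchored pattern requires `<` to be the third character of the line.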

 

Reposted from: http://opsvl.baihongyu.com/
