I have a re.findall() call searching for a pattern in Python, but it returns some undesired results and I want to know how to exclude them. The text is below. I want to extract the names, and my statement re.findall(r'([A-Z]{4,} \w. \w*|[A-Z]{4,} \w*)', text) returns this:
'ERIN E. SCHNEIDER',
'MONIQUE C. WINKLER',
'JASON M. HABERMEYER',
'MARC D. KATZ',
'JESSICA W. CHAN',
'RAHUL KOLHATKAR',
'TSPU or taken',
'TSPU or the',
'TSPU only',
'TSPU was',
'TSPU and']
I want to get rid of the items that start with "TSPU". Does anyone know how to do it? Here is the text:
JINA L. CHOI (NY Bar No. 2699718)
ERIN E. SCHNEIDER (Cal. Bar No. 216114) [email protected]
MONIQUE C. WINKLER (Cal. Bar No. 213031) [email protected]
JASON M. HABERMEYER (Cal. Bar No. 226607) [email protected]
MARC D. KATZ (Cal. Bar No. 189534) [email protected]
JESSICA W. CHAN (Cal. Bar No. 247669) [email protected]
RAHUL KOLHATKAR (Cal. Bar No. 261781) [email protected]
- The Investor Solicitation Process Generally Included a Face-to-Face Meeting, a Technology Demonstration, and a Binder of Materials [...]
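
One possible way to drop the "TSPU" items, as a rough sketch rather than a definitive answer: the unwanted matches all continue with lowercase words ("or", "was", "only"), while the names are entirely uppercase, so the pattern can require an uppercase last name. This assumes every name in the text is written in capitals, with at most one middle initial. The unescaped "." in the original pattern matches any character, so it is escaped here. A second variant keeps the looser \w matching and only skips the literal word TSPU with a negative lookahead. The last line of the sample text is a made-up sentence standing in for the part of the document that mentions TSPU.

```python
import re

# Name lines copied from the question. The final line is a hypothetical
# sentence, added only so that "TSPU" appears in the sample.
text = """\
JINA L. CHOI (NY Bar No. 2699718)
ERIN E. SCHNEIDER (Cal. Bar No. 216114) [email protected]
MONIQUE C. WINKLER (Cal. Bar No. 213031) [email protected]
JASON M. HABERMEYER (Cal. Bar No. 226607) [email protected]
MARC D. KATZ (Cal. Bar No. 189534) [email protected]
JESSICA W. CHAN (Cal. Bar No. 247669) [email protected]
RAHUL KOLHATKAR (Cal. Bar No. 261781) [email protected]
The TSPU was tested, and TSPU only.
"""

# Variant 1: first name of 4+ capitals, optional middle initial with an
# escaped dot, then a last name made only of capitals.
# "TSPU was" fails because "was" is lowercase.
upper_only = re.compile(r'\b[A-Z]{4,}(?: [A-Z]\.)? [A-Z]+\b')
names_upper = upper_only.findall(text)

# Variant 2: keep \w for the later words, but refuse to start a match
# at the word "TSPU" by using a negative lookahead.
skip_tspu = re.compile(r'\b(?!TSPU\b)[A-Z]{4,}(?: \w\.)? \w+')
names_skip = skip_tspu.findall(text)

print(names_upper)
print(names_skip)
```

On this sample both variants return the seven names and no "TSPU" entries. The first variant is probably the more robust choice if other acronyms can appear in the text, since the lookahead only excludes the one word it names.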