# Regular Expressions: Extracting Necessary Information from Random Data

Task: Extract important information from random data using Regular Expressions.

1. Extract Email ID from data = ' From FacePrep@focusacademy.in on Saturday morning Jan 26, important mail'.

Solution:

```python
import re

data = ' From FacePrep@focusacademy.in on Saturday morning Jan 26, important mail'

# \S+ matches one or more non-space characters and '@' matches itself,
# so \S+@\S+ picks out a word (from one space to the next) that contains '@'.
y = re.findall(r'\S+@\S+', data)
print(y)
```

Output: `['FacePrep@focusacademy.in']`

In the above code, it is important to note that the `+` quantifiers are greedy: each `\S+` matches as many characters as possible. If we had added `?` after each `+` to make the quantifiers non-greedy (`\S+?@\S+?`), the output would have been y = ['FacePrep@f']. The search still starts at the leftmost possible position, so the first `\S+?` has to grow until it reaches '@', while the second one stops after a single character.
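The difference between greedy and non-greedy matching can be checked directly. The following is a minimal sketch that reuses the same `data` string; the variable names `greedy` and `lazy` are chosen here for illustration only.

```python
import re

data = ' From FacePrep@focusacademy.in on Saturday morning Jan 26, important mail'

# Greedy: each \S+ takes as many non-space characters as it can.
greedy = re.findall(r'\S+@\S+', data)

# Non-greedy: each \S+? takes as few non-space characters as it can,
# but the match still has to start at the leftmost possible position.
lazy = re.findall(r'\S+?@\S+?', data)

print(greedy)  # ['FacePrep@focusacademy.in']
print(lazy)    # ['FacePrep@f']
```

The non-greedy version is rarely what you want for this task, since it cuts the domain off after its first character.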

2. Extract email id from a file with multiple lines.

```python
import re

with open('filename.txt') as handle:
    for line in handle:
        # '^From ' only matches lines that start with "From ";
        # the group captures the word after it that contains '@'.
        y = re.findall(r'^From (\S+@\S+)', line)
        if y:
            print(y)
```
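To try the line-by-line approach without a file on disk, the sketch below reads from an in-memory `io.StringIO` object instead. The sample lines and the addresses at `example.com` and `example.org` are invented for illustration, and the pattern assumes a space between "From" and the address.

```python
import io
import re

# Hypothetical file contents, invented for this example.
sample = io.StringIO(
    "From FacePrep@focusacademy.in Sat Jan 26\n"
    "Subject: important mail\n"
    "From: someone@example.com\n"
    "From admin@example.org Sun Jan 27\n"
)

emails = []
for line in sample:
    # Only lines that begin with "From " followed by an address match;
    # "From: ..." is skipped because ':' comes right after "From".
    emails.extend(re.findall(r'^From (\S+@\S+)', line))

print(emails)  # ['FacePrep@focusacademy.in', 'admin@example.org']
```

Collecting the matches in a list keeps the results from every line, whereas reassigning `y` inside the loop only keeps the result for the last line read.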

3. Extract only the domain of the email.

```python
import re

data = ' From FacePrep@focusacademy.in on Saturday morning Jan 26, important mail'

# Start at the '@' symbol and capture everything up to the next space.
# [^ ] means "any character except a space" (note the space inside the brackets);
# * repeats it zero or more times.
y = re.findall(r'@([^ ]*)', data)
print(y)
```

Output: `['focusacademy.in']`
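One detail worth knowing is that `[^ ]` excludes only the space character, so a tab or newline would still be captured. The sketch below compares it with `\S`, which stops at any whitespace; the input string `text` is a made-up variation with a tab after the address.

```python
import re

# Hypothetical input: a tab, not a space, follows the address.
text = 'From FacePrep@focusacademy.in\tSaturday'

# [^ ]* stops only at a space, so it runs through the tab.
by_space = re.findall(r'@([^ ]*)', text)

# \S* stops at any whitespace character, including the tab.
by_whitespace = re.findall(r'@(\S*)', text)

print(by_space)       # ['focusacademy.in\tSaturday']
print(by_whitespace)  # ['focusacademy.in']
```

If the data may contain tabs or line breaks, `\S` is likely the safer choice.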

4. Extract only the username in the email id.

```python
import re

data = ' From FacePrep@focusacademy.in on Saturday morning Jan 26, important mail'

# After the leading ' From ', capture the non-space characters up to the '@' symbol.
y = re.findall(r'^ From (\S+)@', data)
print(y)
```

Output: `['FacePrep']`
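The username and the domain can also be pulled out in a single pass by using two capture groups. This is a minimal sketch on the same `data` string; the names `pairs`, `username` and `domain` are chosen here for illustration.

```python
import re

data = ' From FacePrep@focusacademy.in on Saturday morning Jan 26, important mail'

# With two groups, findall returns one (username, domain) tuple per match.
pairs = re.findall(r'(\S+)@(\S+)', data)
print(pairs)  # [('FacePrep', 'focusacademy.in')]

username, domain = pairs[0]
print(username)  # FacePrep
print(domain)    # focusacademy.in
```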
