Revision:
JSON is a syntax for storing and exchanging data. JSON is text, written with JavaScript object notation.
JSON in Python: Python has a built-in package called "json", which can be used to work with JSON data.
Example 1: import the json module:
import json
Parse JSON - convert from JSON to Python: if you have a JSON string, you can parse it by using the json.loads() method. If the string holds a JSON object, the result will be a Python dictionary.
Example 2: convert from JSON to Python:
import json
# some JSON:
x = '{ "name":"John", "age":30, "city":"New York"}'
# parse x:
y = json.loads(x)
# the result is a Python dictionary:
print(y["age"])  # 30
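The type of the result follows the top-level JSON value, so json.loads() does not always return a dictionary. A minimal sketch, with sample strings made up for illustration:

```python
import json

# Sample JSON strings, made up for illustration.
as_object = json.loads('{"name": "John", "age": 30}')
as_array = json.loads('["apple", "bananas"]')
as_number = json.loads('42')
as_null = json.loads('null')

print(type(as_object).__name__)  # dict
print(type(as_array).__name__)   # list
print(type(as_number).__name__)  # int
print(as_null)                   # None
```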
Convert from Python to JSON: if you have a Python object, you can convert it into a JSON string by using the json.dumps() method.
Example 3: convert from Python to JSON:
import json
# a Python object (dict):
x = { "name": "John", "age": 30, "city": "New York" }
# convert into JSON:
y = json.dumps(x)
# the result is a JSON string:
print(y)  # {"name": "John", "age": 30, "city": "New York"}
You can convert Python objects of the following types into JSON strings: dict, list, tuple, str, int, float, True, False, None.
Example 4: convert Python objects into JSON strings, and print the values:
import json
print(json.dumps({"name": "John", "age": 30}))  # {"name": "John", "age": 30}
print(json.dumps(["apple", "bananas"]))  # ["apple", "bananas"]
print(json.dumps(("apple", "bananas")))  # ["apple", "bananas"]
print(json.dumps("hello"))  # "hello"
print(json.dumps(42))  # 42
print(json.dumps(31.76))  # 31.76
print(json.dumps(True))  # true
print(json.dumps(False))  # false
print(json.dumps(None))  # null
When you convert from Python to JSON, Python objects are converted into the JSON (JavaScript) equivalent:
Python - JSON
dict - Object
list - Array
tuple - Array
str - String
int - Number
float - Number
True - true
False - false
None - null
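Because list and tuple both map to a JSON array, the conversion is not always reversible. The sketch below round-trips a made-up dictionary (the variable names are illustrative) to show that a tuple comes back as a list:

```python
import json

# A made-up object that contains a tuple, a bool and None.
original = {"point": (1, 2), "active": True, "note": None}

text = json.dumps(original)   # Python -> JSON string
restored = json.loads(text)   # JSON string -> Python

print(text)      # {"point": [1, 2], "active": true, "note": null}
print(restored)  # {'point': [1, 2], 'active': True, 'note': None}
print(restored == original)  # False: the tuple came back as a list
```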
Format the result: The json.dumps() method has parameters that make the result easier to read: indent and separators. The separators parameter takes an (item separator, key separator) tuple. When indent is not set, the default value is (", ", ": "), which means a comma and a space between items, and a colon and a space between keys and values. When indent is set, the default becomes (",", ": ").
Example 5: use the indent parameter to define the number of spaces per indentation level:
json.dumps(x, indent=4)
Example 6: use the separators parameter to change the default separator:
json.dumps(x, indent=4, separators=(". ", " = "))
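The separators parameter is also handy for the opposite goal, producing the most compact output. A small sketch, assuming a made-up dictionary x:

```python
import json

x = {"name": "John", "age": 30}  # sample data, made up for illustration

default_text = json.dumps(x)
compact_text = json.dumps(x, separators=(",", ":"))  # no spaces at all
indented_text = json.dumps(x, indent=2)

print(default_text)  # {"name": "John", "age": 30}
print(compact_text)  # {"name":"John","age":30}
print(indented_text)
```

The last call prints one key per line, each indented by two spaces.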
Order the result: The json.dumps() method has a parameter to order the keys in the result: sort_keys.
Example 7: use the sort_keys parameter to specify if the result should be sorted or not:
json.dumps(x, indent=4, sort_keys=True)
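One practical use of sort_keys is to get the same JSON text for dictionaries that hold the same data in a different insertion order. A minimal sketch with two made-up dictionaries:

```python
import json

# Same content, different insertion order (sample data).
a = {"name": "John", "age": 30}
b = {"age": 30, "name": "John"}

plain_equal = json.dumps(a) == json.dumps(b)
sorted_a = json.dumps(a, sort_keys=True)
sorted_b = json.dumps(b, sort_keys=True)

print(plain_equal)           # False
print(sorted_a)              # {"age": 30, "name": "John"}
print(sorted_a == sorted_b)  # True
```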
A RegEx, or regular expression, is a sequence of characters that forms a search pattern. RegEx can be used to check if a string contains the specified search pattern.
RegEx module: Python has a built-in package called "re", which can be used to work with Regular Expressions.
RegEx in Python: when you have imported the "re" module, you can start using regular expressions:
Example 8: search the string to see if it starts with "The" and ends with "Spain":
import re
txt = "The rain in Spain"
# x is a Match object if the pattern matches, otherwise None:
x = re.search("^The.*Spain$", txt)
RegEx functions: the "re" module offers a set of functions that allow us to search a string for a match:
Function - Description
findall - Returns a list containing all matches
search - Returns a Match object if there is a match anywhere in the string
split - Returns a list where the string has been split at each match
sub - Replaces one or many matches with a string
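As a quick side-by-side, the four functions in the table can be applied to the same string. This is only a sketch; the replacement text "AI" is an arbitrary choice:

```python
import re

txt = "The rain in Spain"

all_matches = re.findall(r"ai", txt)   # every match, as a list
first_match = re.search(r"ai", txt)    # Match object for the first match
pieces = re.split(r"\s", txt)          # split at each white-space character
replaced = re.sub(r"ai", "AI", txt)    # replace every match

print(all_matches)          # ['ai', 'ai']
print(first_match.start())  # 5
print(pieces)               # ['The', 'rain', 'in', 'Spain']
print(replaced)             # The rAIn in SpAIn
```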
Metacharacters: metacharacters are characters with a special meaning:
Character - Description - Example
[] - A set of characters - "[a-m]"
\ - Signals a special sequence (can also be used to escape special characters) - "\d"
. - Any character (except newline character) - "he..o"
^ - Starts with - "^hello"
$ - Ends with - "world$"
* - Zero or more occurrences - "aix*"
+ - One or more occurrences - "aix+"
{} - Exactly the specified number of occurrences - "al{2}"
| - Either or - "falls|stays"
() - Capture and group
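The example patterns from the table can be tried directly with findall() and search(). A sketch, with sample strings made up for illustration; the last pattern adds a capture group, since the table gives no example for ():

```python
import re

txt = "The rain in Spain falls mainly in the plain!"  # sample string

any_char = re.findall(r"he..o", "hello world")   # . matches any character
zero_or_more = re.findall(r"aix*", txt)          # "ai" followed by 0+ "x"
one_or_more = re.findall(r"aix+", txt)           # "ai" followed by 1+ "x"
exactly_two = re.findall(r"al{2}", txt)          # "a" followed by exactly two "l"
either = re.findall(r"falls|stays", txt)         # either "falls" or "stays"
groups = re.search(r"(\w+) in (\w+)", txt).groups()  # () captures each word

print(any_char)      # ['hello']
print(zero_or_more)  # ['ai', 'ai', 'ai', 'ai']
print(one_or_more)   # []
print(exactly_two)   # ['all']
print(either)        # ['falls']
print(groups)        # ('rain', 'Spain')
```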
Special sequences: a special sequence is a "\" followed by one of the characters in the list below, and has a special meaning:
Character - Description - Example
\A - Returns a match if the specified characters are at the beginning of the string - "\AThe"
\b - Returns a match where the specified characters are at the beginning or at the end of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string") - r"\bain", r"ain\b"
\B - Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string") - r"\Bain", r"ain\B"
\d - Returns a match where the string contains digits (numbers from 0-9) - "\d"
\D - Returns a match for any character that is NOT a digit - "\D"
\s - Returns a match where the string contains a white space character - "\s"
\S - Returns a match for any character that is NOT a white space character - "\S"
\w - Returns a match where the string contains any word characters (letters a-z and A-Z, digits from 0-9, and the underscore _ character) - "\w"
\W - Returns a match for any character that is NOT a word character - "\W"
\Z - Returns a match if the specified characters are at the end of the string - "Spain\Z"
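To see how the word-boundary and character-class sequences behave, the sketch below runs several of them against the string used elsewhere in these notes. "Room 42" is a made-up sample, and the patterns are written as raw strings so the backslashes reach the regex engine unchanged:

```python
import re

txt = "The rain in Spain"

starts_word = re.findall(r"\bain", txt)   # "ain" at the start of a word
ends_word = re.findall(r"ain\b", txt)     # "ain" at the end of a word
inside_start = re.findall(r"\Bain", txt)  # "ain" NOT at the start of a word
digits = re.findall(r"\d", "Room 42")     # each digit is a separate match
spaces = re.findall(r"\s", txt)           # each white-space character

print(starts_word)   # []
print(ends_word)     # ['ain', 'ain']
print(inside_start)  # ['ain', 'ain']
print(digits)        # ['4', '2']
print(len(spaces))   # 3
print(bool(re.search(r"\AThe", txt)))    # True
print(bool(re.search(r"Spain\Z", txt)))  # True
```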
Sets: a set is a set of characters inside a pair of square brackets [] with a special meaning:
Set - Description
[arn] - Returns a match where one of the specified characters (a, r, or n) is present
[a-n] - Returns a match for any lower case character, alphabetically between a and n
[^arn] - Returns a match for any character EXCEPT a, r, and n
[0123] - Returns a match where any of the specified digits (0, 1, 2, or 3) are present
[0-9] - Returns a match for any digit between 0 and 9
[0-5][0-9] - Returns a match for any two-digit number from 00 to 59
[a-zA-Z] - Returns a match for any character alphabetically between a and z, lower case OR upper case
[+] - In sets, +, *, ., |, (), $, {} have no special meaning, so [+] means: return a match for any + character in the string
The findall() function returns a list containing all matches. The list contains the matches in the order they are found. If no matches are found, an empty list is returned:
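A few of the sets from the table, tried with findall(). This is a sketch; the time string and the sum are made-up samples:

```python
import re

txt = "The rain in Spain"

one_of = re.findall(r"[arn]", txt)           # any of a, r, n
in_range = re.findall(r"[a-n]", "Spain")     # lower case a through n
negated = re.findall(r"[^arn]", "rain")      # anything except a, r, n
two_digits = re.findall(r"[0-5][0-9]", "8 times before 11:45 AM")
literal_plus = re.findall(r"[+]", "1 + 2 + 3")  # + is literal inside a set

print(one_of)        # ['r', 'a', 'n', 'n', 'a', 'n']
print(in_range)      # ['a', 'i', 'n']
print(negated)       # ['i']
print(two_digits)    # ['11', '45']
print(literal_plus)  # ['+', '+']
```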
Example 9: print a list of all matches:
import re
# Return a list containing every occurrence of "ai":
txt = "The rain in Spain"
x = re.findall("ai", txt)
y = re.findall("Portugal", txt)
print(x)  # ['ai', 'ai']
print(y)  # []
The search() function searches the string for a match, and returns a "Match object" if there is a match. If there is more than one match, only the first occurrence of the match will be returned. If no matches are found, the value None is returned.
Example 10: search for the first white-space character in the string:
import re
txt = "The rain in Spain"
# a raw string keeps the backslash in "\s" from being read as a string escape:
x = re.search(r"\s", txt)
print("The first white-space character is located in position:", x.start())
# The first white-space character is located in position: 3
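Since search() returns None when nothing matches, calling .start() on the result without a check raises an AttributeError. A minimal sketch of the check, using a hypothetical helper named first_position (not part of the re module):

```python
import re


def first_position(pattern, text):
    """Hypothetical helper: start index of the first match, or None."""
    match = re.search(pattern, text)
    if match is None:  # no match anywhere in the string
        return None
    return match.start()


txt = "The rain in Spain"
print(first_position(r"\s", txt))        # 3
print(first_position(r"Portugal", txt))  # None
```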
The split() function returns a list where the string has been split at each match. You can control the number of occurrences by specifying the maxsplit parameter.
Example 11: split at each white-space character:
import re
txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)  # ['The', 'rain', 'in', 'Spain']
Example 12: split the string only at the first occurrence:
import re
txt = "The rain in Spain"
# maxsplit is passed by keyword so its meaning is explicit:
x = re.split(r"\s", txt, maxsplit=1)
print(x)  # ['The', 'rain in Spain']
The sub() function replaces the matches with the text of your choice. You can control the number of replacements by specifying the "count" parameter:
Example 13: replace every white-space character with the number 9:
import re
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)  # The9rain9in9Spain
Example 14: replace the first 2 occurrences:
import re
txt = "The rain in Spain"
# count is passed by keyword so its meaning is explicit:
x = re.sub(r"\s", "9", txt, count=2)
print(x)  # The9rain9in Spain
Match object: a match object is an object containing information about the search and the result. If there is no match, the value None will be returned, instead of the Match Object.
Example 15: do a search that will return a Match Object:
import re
txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)  # this will print an object
The Match object has properties and methods used to retrieve information about the search, and the result:
.span() - returns a tuple containing the start and end positions of the match
.string - returns the string passed into the function
.group() - returns the part of the string where there was a match
Example 16: print the position (start and end position) of the first match occurrence. The regular expression looks for the first word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())  # (12, 17)
Example 17: print the part of the string where there was a match. The regular expression looks for the first word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())  # Spain
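The .string property has no example above, so the sketch below puts the three members side by side on the same search. It also uses .start() and .end(), which return the two halves of .span(), and checks for None first:

```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)

if x is not None:  # search() returns None when nothing matches
    print(x.span())              # (12, 17)
    print(x.start(), x.end())    # 12 17
    print(x.string)              # The rain in Spain
    print(x.group())             # Spain
    # Slicing the original string with the span gives the same text:
    print(txt[x.start():x.end()])  # Spain
```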