Showing posts from 2020

How to read a CSV file using PySpark?

Reading data files with PySpark takes a single command. However, you need to add options depending on the format and content of the file.

Simple file read command: pass only the file path to the CSV reader.

Reading a CSV file by passing header: set the header option so that the first row is used as the column names.

Other options you can add are:

escape : "\""
multiLine : "true"
delimiter : "|"
nullValue : "\\N"
inferSchema : "true"
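The read commands and options described above can be sketched roughly as follows. This is a minimal illustration, not the post's original code: the helper names (`read_csv_simple`, `read_csv_with_header`, `read_csv_with_options`), the `CSV_OPTIONS` dictionary, and the path `data/sample.csv` are all assumptions made for the example. Only `spark.read`, `.option()`, `.options()`, and `.csv()` are PySpark's own reader calls.

```python
# Options listed in the post, collected in one place.
# "\\N" is the two-character string backslash + N, a common NULL marker in exports.
CSV_OPTIONS = {
    "header": "true",       # first row holds the column names
    "escape": "\"",         # character used to escape quotes inside quoted values
    "multiLine": "true",    # allow quoted values that span several lines
    "delimiter": "|",       # field separator (pipe instead of comma)
    "nullValue": "\\N",     # string that should be read as null
    "inferSchema": "true",  # let Spark guess column types (costs an extra pass)
}


def read_csv_simple(spark, path):
    """Simplest read: no header, columns come back as strings named _c0, _c1, ..."""
    return spark.read.csv(path)


def read_csv_with_header(spark, path):
    """Read a CSV file whose first row contains the column names."""
    return spark.read.option("header", "true").csv(path)


def read_csv_with_options(spark, path, options=CSV_OPTIONS):
    """Read a CSV file with several reader options applied at once."""
    return spark.read.options(**options).csv(path)


def main():
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-csv-example").getOrCreate()
    df = read_csv_with_options(spark, "data/sample.csv")  # hypothetical path
    df.printSchema()
    df.show(5)
    spark.stop()
```

Calling `main()` on a machine with PySpark installed runs the full read. Keeping the options in a dictionary is just one way to organize them; chaining `.option(key, value)` calls would presumably work equally well.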