In part 1 of this series I described the functionality of an aspect of the facilities and real estate services web site, part of the energy management aspect of the sustainability program. Since this functionality depends on the ability to gather data from an FTP server, that's the first piece I implemented. My language of choice for this work is Python, mainly because it is well supported by AWS, I know it better than JavaScript (which is also well supported on the AWS platform), and it is well supported on the development systems I tend to use, running Windows and Linux.


Given that the data comes from an FTP server secured with a username and password, the first thing to do is store that information somewhere. Best practice is to keep credentials separate from the application; on most modern web servers, credentials can be encrypted and are generally stored in a location that is not part of the normal web content, so they cannot accidentally be exposed as a web document. I'm using a similar model, though I am not encrypting the authentication data, since it is not considered highly confidential. It's worth noting that the current implementation does not encrypt that data either.


I chose to use an S3 bucket to store the FTP information as well as the retrieved logs, given the low confidentiality requirements. Of course I can easily update this to use separate buckets, which would allow me to further separate the authentication data from the stored environmental data. AWS has a nice Identity and Access Management system, and it's beyond the scope of this blog to go deeply into its functionality. Suffice it to say, I have defined a role that is allowed to access the S3 bucket containing the data it needs to operate. The code that I build will run under that role, and I restrict its access to the exact privileges it needs to perform its work.
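To give an idea of what such a restriction looks like, here is a sketch of an IAM policy statement granting read-only access to objects in the bucket. This is an illustration of the pattern, not the exact policy I deployed:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::fres-data.es.isc.upenn.edu/*"
    }
  ]
}
```

Attaching a statement like this to the role (and nothing broader) keeps the code limited to reading objects from that one bucket.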

I start by defining a configuration file that can be easily parsed with Python's ConfigParser. As the documentation for this class states: 'The ConfigParser class implements a basic configuration file parser language which provides a structure similar to what you would find on Microsoft Windows INI files'. The config file looks something like this:

[ftp-server]
server: 192.168.1.2
username: exampleuser
password: secretpassword
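Parsing this format takes only a few lines. As a quick sketch, with the same example settings inlined as a string rather than read from a file:

```python
import configparser

# The same settings as the example above, inline for illustration.
ftp_ini = """\
[ftp-server]
server: 192.168.1.2
username: exampleuser
password: secretpassword
"""

config = configparser.ConfigParser()
config.read_string(ftp_ini)

# ConfigParser accepts both ':' and '=' as key/value delimiters.
server = config.get('ftp-server', 'server')
username = config.get('ftp-server', 'username')
password = config.get('ftp-server', 'password')
```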


As mentioned, I store this config file in an S3 bucket (named fres-data.es.isc.upenn.edu for now) that the Lambda function I will eventually use to connect to the FTP server can read, since the role under which the Lambda function runs has access to that bucket. It's worth mentioning that the same code can be run from the command line on my workstation, since the AWS libraries make this completely transparent. You can use the AWS console to upload the file to S3, or the AWS CLI equivalent:

aws s3 cp ./fres_data.ini s3://fres-data.es.isc.upenn.edu/fres_data.ini

The code to read the config file will now look something like this:

import configparser

import boto3

def get_config_from_s3():
    # Read the FTP configuration file from the S3 bucket.
    bucket_name = 'fres-data.es.isc.upenn.edu'
    config_file = 'fres_data.ini'
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(bucket_name)
    cfg = bucket.Object(config_file).get()
    config = configparser.ConfigParser()
    # The object body is bytes; decode before handing it to ConfigParser.
    config.read_string(cfg['Body'].read().decode('utf-8'))
    return config
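With the parsed config in hand, the credentials can be handed to Python's ftplib to open the connection. A minimal sketch — ftp_credentials is a helper name I'm introducing here, and the commented-out calls assume the server accepts a plain FTP login:

```python
import configparser
from ftplib import FTP

def ftp_credentials(config):
    """Pull the FTP connection details out of the parsed config."""
    section = config['ftp-server']
    return section['server'], section['username'], section['password']

# Usage, with the ConfigParser instance returned by get_config_from_s3():
#   server, username, password = ftp_credentials(config)
#   ftp = FTP(server)
#   ftp.login(user=username, passwd=password)
#   ftp.retrlines('LIST')   # e.g. list the log files available for download
#   ftp.quit()
```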

In a production environment, this will include code to catch errors such as a non-existent bucket or an invalid file format. The nice thing about this way of working is that you can secure your credentials and configuration in an S3 object that is only accessible to code running under a role you define, with the constraints you require. Of course it is also quite easy to define a role that has access to all bucket contents, so you still have to do your due diligence when using this pattern.
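To illustrate what that error handling might look like, here is a sketch that separates the parsing step (which can fail on a malformed file) from the S3 read (which can fail on a missing bucket or object, or insufficient role permissions). The function names parse_config_bytes and get_config_from_s3_safe are ones I'm introducing for this example:

```python
import configparser
import logging

logger = logging.getLogger(__name__)

def parse_config_bytes(data):
    """Parse raw INI bytes, validating that the expected section exists."""
    config = configparser.ConfigParser()
    config.read_string(data.decode('utf-8'))
    if not config.has_section('ftp-server'):
        raise ValueError("config is missing the [ftp-server] section")
    return config

def get_config_from_s3_safe(bucket_name='fres-data.es.isc.upenn.edu',
                            config_file='fres_data.ini'):
    """Read the config from S3, logging rather than crashing on failure."""
    import boto3  # imported here so the parsing helper stays testable offline
    from botocore.exceptions import ClientError
    try:
        obj = boto3.resource('s3').Bucket(bucket_name).Object(config_file).get()
        return parse_config_bytes(obj['Body'].read())
    except ClientError as err:
        # Covers a missing bucket or object, or insufficient role permissions.
        logger.error("could not read %s from %s: %s", config_file, bucket_name, err)
    except (ValueError, UnicodeDecodeError) as err:
        logger.error("config file %s is malformed: %s", config_file, err)
    return None
```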


If you have any comments, questions or other observations, please contact me directly via email: vmic@isc.upenn.edu.