Using the Cloud as a Media Hosting Option For SharePoint 2010 (Part 1–Amazon S3)


I know it’s been a while since I’ve posted anything to the blogosphere… but honest to Abe, I had a good reason: I have been working, and working hard I should say. That also means I have a lot of topics lined up for the upcoming months…

For this topic I’d like to focus on something my team and I have been working on: SharePoint 2010 and media files (videos, pictures, and anything media related). We recently had the requirement to create a SharePoint web application that used a ton of technologies. I’ll blog about each of them at a later date; however, one that caught my attention was the usage of Amazon S3 and Amazon CloudFront. In this post I’ll talk about how we are using Amazon S3, the Microsoft Expression Encoder SDK, and SharePoint to provide a great-performing media experience for your SharePoint end users.

Those who have worked with SharePoint before know that SharePoint, no matter what version, is not the best server or software for media, especially HD videos and high-resolution pictures. Proof of this can be found in Microsoft’s attempt at addressing it through SQL Server’s Remote BLOB Storage (RBS). The RBS feature gives you the option of not storing BLOBs (Binary Large Objects) inline with the other SharePoint content table entries. In other words, when you add items to a list and one of the fields is a file, such as a large video file (the whole bit stream…), by default SharePoint stores that file in the same SQL content table record as the other metadata details such as Title, Author, date, and so on. Can you imagine the load this places on a server that needs to serve up hundreds if not thousands of entries, each with its own large video file? While SharePoint does support this, your media experience will be horrible, and that’s putting it politely. Acknowledging this is one of the reasons Microsoft’s SQL team created RBS. RBS allows you to offload your large files to another storage mechanism using “RBS providers,” which are effectively data storage providers. These providers stay in sync with the inline metadata entries for a given SQL table (and in this case your SharePoint list) by communicating with the RBS APIs and SQL Server. There are many debatable reasons to either use the default provider (SQL FILESTREAM) or create your own (which, by the way, Microsoft only recommends ISVs do…), but we have chosen to go down a different path. Maybe we will eventually align our design with the RBS provider architecture Microsoft provides… but then again maybe not. That’s another post for another time. With that, I’ll get into our design.

In order to understand our design, let’s first talk about our requirements. At first we thought some of them were going to be way too complex to implement in the little time we were given. However, after a quick Proof of Concept (POC) and a little light at the end of the tunnel, things started going our way. Our requirements actually turned out to be simple: create a list that allows end users to upload videos, pictures, audio, and other files. The files must not be stored on the web front-end servers, but rather on media hosting servers, so as not to put too much pressure and resource servicing on the presentation servers. Along with the first two requirements, we also had to convert and encode all the videos to a standard MP4 format, create a thumbnail for each, and validate that the length is 10 minutes or less. The architecture and implementation we put in place was done by my colleague Soledad Pano and is described here. In her post she outlines the three major components she used to build it:

  1. Custom Upload Process: This is the front end of the solution. It consists of a custom list with a custom upload form. The list has the link to the media file plus additional metadata fields (title, author, date, keywords, etc). When you click to create a new item on the list, the custom upload form is opened and you can browse for a file to upload. The form has the required validation logic and it serves to save the assets to the configured location, which can be a SharePoint library or an external location, like the file system or an FTP server. When the upload finishes you are redirected to the list item edit form so you can enter the metadata. The experience is similar to uploading a file to a SharePoint document library.

  2. Media Processing Backend Process: This consists of a timer job that queries the Media Assets list for items to process. It encodes the videos, generates thumbnail and poster images, and uploads everything to the final destination. Finally, it notifies the user of the result of the process by email. For the video encoding we used the Microsoft Expression Encoder SDK. As I will explain later, this SDK cannot be used inside a SharePoint process, so it runs in a separate process that is invoked from the timer job.

  3. Storage Manager: this is a flexible and extensible component that abstracts the logic of saving (and deleting) a file in the final location, depending on the flavor chosen through configuration (file system, SharePoint library, or FTP). This component is used both by the front-end upload mechanism and the back-end media processing job.

The above is taken from Soledad’s post, in which the Storage Manager component is one of the extensibility points we can use to integrate with Amazon S3.
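One quick aside on the second component above: because the Expression Encoder SDK can’t be loaded inside a SharePoint process, the timer job has to shell out to a separate encoding process. A minimal sketch of that pattern is below; EncoderWorker.exe, its arguments, and the timeout handling are hypothetical placeholders, not the actual worker from Soledad’s solution:

using System;
using System.Diagnostics;

// Hypothetical helper a timer job could call to run the encoder out of process.
public static class EncoderLauncher
{
    public static bool Encode(string inputPath, string outputPath, int timeoutMilliseconds)
    {
        ProcessStartInfo startInfo = new ProcessStartInfo
        {
            // Placeholder: a console app that wraps the Expression Encoder SDK
            FileName = @"C:\MediaProcessing\EncoderWorker.exe",
            Arguments = string.Format("\"{0}\" \"{1}\"", inputPath, outputPath),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            // Wait for the encode to finish (or give up after the timeout) so the
            // timer job can report success or failure back to the user by email.
            if (!process.WaitForExit(timeoutMilliseconds))
            {
                process.Kill();
                return false;
            }
            return process.ExitCode == 0;
        }
    }
}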

The Storage Manager implements a single interface, IAssetStorageManager:

public interface IAssetStorageManager
{
    void Delete(string fileUrl);
    string Save(System.IO.FileInfo file);
    string Save(string fileName, System.IO.Stream fileStream);
}

The above interface tells you everything you need to know about what to implement: if you want to upload a file or stream, implement the “Save” methods; if you want to delete a file, implement the “Delete” method. So, following Soledad’s design, I implemented a storage manager that uploads to Amazon S3.
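Before getting to the Amazon-specific implementation, here is a trivial, illustration-only file-system flavor of the interface just to show what any implementation has to provide. This is a hypothetical sketch with an assumed root path, not the FileSystem manager from Soledad’s actual solution:

using System.IO;

// Illustration only: a minimal file-system IAssetStorageManager.
public class FileSystemAssetStorageManager : IAssetStorageManager
{
    private readonly string rootPath;

    public FileSystemAssetStorageManager(string rootPath)
    {
        this.rootPath = rootPath;
    }

    public void Delete(string fileUrl)
    {
        // The solution stores the asset URL in the list item, so map it back to a path.
        string path = Path.Combine(rootPath, Path.GetFileName(fileUrl));
        if (File.Exists(path))
        {
            File.Delete(path);
        }
    }

    public string Save(FileInfo file)
    {
        using (FileStream stream = file.OpenRead())
        {
            return Save(file.Name, stream);
        }
    }

    public string Save(string fileName, Stream fileStream)
    {
        string path = Path.Combine(rootPath, fileName);
        using (FileStream target = File.Create(path))
        {
            // Manual buffer copy (Stream.CopyTo is not available on .NET 3.5,
            // which is what SharePoint 2010 runs on).
            byte[] buffer = new byte[81920];
            int read;
            while ((read = fileStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                target.Write(buffer, 0, read);
            }
        }
        return path;   // the "URL" saved back to the list item
    }
}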

Getting Started with Amazon

To use AmazonS3, you must first sign up on Amazon’s site: http://aws.amazon.com/


Click the Sign Up button, enter your information, give ’em a credit card, and you’re done. If you already have an Amazon account from purchasing books and other items, you can use that account.

Once you have an account, you can log in to the AWS Management Console and get your Amazon S3 Access Key ID and Secret Access Key. Remember both of these; you’ll need them to access Amazon S3 through the API.


You can also navigate to the Amazon S3 area of the console (https://console.aws.amazon.com/console/home).


Once inside Amazon S3, the AWS panel is easy to navigate: you can create S3 buckets, look at which items are inside the buckets, and so on. The only catch is that the panel is rather lacking in functionality: no searching, no sorting, really just a look inside what you have.


Create your bucket and remember its name, because it’s the name you will pass to the AmazonS3AssetStorageManager.
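As a side note, if you’d rather script the bucket creation than click through the console, the same AWS SDK for .NET used later in this post can do it. Here is a minimal sketch; the bucket name and credentials are placeholders, and bucket names must be globally unique across all of S3:

using System;
using Amazon;
using Amazon.S3;
using Amazon.S3.Model;

class CreateBucketExample
{
    static void Main()
    {
        // Placeholder credentials; use your own Access Key ID and Secret Access Key.
        using (AmazonS3 client = AWSClientFactory.CreateAmazonS3Client(
            "YOUR-ACCESS-KEY-ID", "YOUR-SECRET-ACCESS-KEY"))
        {
            PutBucketRequest request = new PutBucketRequest()
                .WithBucketName("test");   // placeholder bucket name

            client.PutBucket(request);
            Console.WriteLine("Bucket created.");
        }
    }
}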

AmazonS3AssetStorageManager

The implementation is below:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using System.IO;

using Amazon.S3;
using Amazon.S3.Model;
using Amazon.S3.Util;
using Microsoft.SharePoint.Administration;
using Common.Logging;

namespace Common.AssetStorage
{
    public class AmazonS3AssetStorageManager : IAssetStorageManager
    {
        private string bucketName = string.Empty;
        private string keyPrefix = string.Empty;
        private string accessKeyID = string.Empty;
        private string secretAccessKeyID = string.Empty;

        AmazonS3 client;
        private Logger logger;

        public AmazonS3AssetStorageManager(string _bucketName, string _keyName, string _accessKeyID, string _secretAccessKeyID)
        {
            // Set up the class-level variables
            bucketName = _bucketName;
            keyPrefix = _keyName;
            accessKeyID = _accessKeyID;
            secretAccessKeyID = _secretAccessKeyID;
            logger = new Logger();
        }

        public void Delete(string fileName)
        {
            // To delete we need the exact key of the file that is in Amazon S3
            // as well as the bucket name where the file exists.
            // In my implementation the key is KeyPrefix-FileName.
            string uniqueKeyItemName = string.Format("{0}-{1}", keyPrefix, fileName);
            DeleteObjectRequest deleteObjectRequest =
                new DeleteObjectRequest()
                    .WithBucketName(bucketName)
                    .WithKey(uniqueKeyItemName);

            // Create the client with the credentials supplied to the constructor
            using (client = Amazon.AWSClientFactory.CreateAmazonS3Client(accessKeyID, secretAccessKeyID))
            {
                try
                {
                    client.DeleteObject(deleteObjectRequest);
                    logger.LogToOperations(Categories.Media, EventSeverity.Information, "Amazon Object KeyID: {0} deleted successfully", uniqueKeyItemName);
                }
                catch (AmazonS3Exception s3Exception)
                {
                    logger.LogToOperations(s3Exception, Categories.Media, EventSeverity.ErrorCritical,
                                           "Error occurred in Delete operation for ObjectKeyID: {0}", uniqueKeyItemName);
                }
            }
        }

        public string Save(FileInfo file)
        {
            using (FileStream fileStream = new FileStream(file.FullName, FileMode.Open))
            {
                return Save(file.Name, fileStream);
            }
        }

        public string Save(string fileName, Stream fileStream)
        {
            // To upload to Amazon S3 we can use either HTTP or HTTPS.
            // When using HTTPS, you have to make sure you have the Amazon S3 x.509 cert
            // trusted in your cert store.
            AmazonS3Config S3Config = new AmazonS3Config()
            {
                ServiceURL = "s3.amazonaws.com",
                CommunicationProtocol = Amazon.S3.Model.Protocol.HTTP,
            };

            using (client = Amazon.AWSClientFactory.CreateAmazonS3Client(
                    accessKeyID, secretAccessKeyID, S3Config))
            {
                return UploadToAmazon(fileName, fileStream);
            }
        }

        string UploadToAmazon(string fileName, Stream fileStream)
        {
            try
            {
                string uniqueKeyItemName = string.Format("{0}-{1}", keyPrefix, fileName);
                PutObjectRequest request = new PutObjectRequest();
                request.WithInputStream(fileStream);
                request.WithBucketName(bucketName)
                    .WithKey(uniqueKeyItemName);
                request.WithMetaData("title", fileName);

                // If a header is needed, you can add custom header values as well as
                // specific values Amazon S3 uses when querying the item, e.g.:
                //request.AddHeaders(AmazonS3Util.CreateHeaderEntry("ContentType", contentType));

                // Here we explicitly allow all users the ability to read the specific file.
                // We can put security policies on files, buckets, and other items in Amazon S3.
                S3CannedACL anonPolicy = S3CannedACL.PublicRead;
                request.WithCannedACL(anonPolicy);
                S3Response response = client.PutObject(request);

                // If you want to create a temporary Amazon S3 URL that expires after some time
                // (for security reasons), you can do the following instead:
                //GetPreSignedUrlRequest publicUrlRequest = new GetPreSignedUrlRequest().WithBucketName(bucketName).WithKey(uniqueKeyItemName).WithExpires(DateTime.Now.AddMonths(3));
                //var urlResponse = client.GetPreSignedURL(publicUrlRequest);

                response.Dispose();

                // Otherwise the URL will be a public URL,
                // which is always https://s3.amazonaws.com/[yourBucketName]/[YourUploadedFileKey]
                var urlResponse = string.Format("https://s3.amazonaws.com/{0}/{1}", bucketName, uniqueKeyItemName);
                return urlResponse;
            }
            catch (AmazonS3Exception amazonS3Exception)
            {
                if (amazonS3Exception.ErrorCode != null &&
                    (amazonS3Exception.ErrorCode.Equals("InvalidAccessKeyId")
                    ||
                    amazonS3Exception.ErrorCode.Equals("InvalidSecurity")))
                {
                    logger.LogToOperations(amazonS3Exception, Categories.Media, EventSeverity.ErrorCritical,
                                           "Error - Invalid Credentials - please check the provided AWS Credentials");
                    return null;
                }
                else
                {
                    logger.LogToOperations(amazonS3Exception, Categories.Media, EventSeverity.ErrorCritical,
                                           "Error occurred when uploading media: {0}", amazonS3Exception.Message);
                    return null;
                }
            }
        }
    }
}

The Amazon S3 storage manager implements the three interface methods: the two Save overloads and Delete. To programmatically work with Amazon S3 you have a variety of options: the REST interface, the SOAP interface, or the SDK frameworks provided by Amazon and the community.

Using the REST interface is the rawest way to communicate, by far the most complex, and in my opinion the most powerful. The REST interface can be used from many different programming languages and environments, such as JavaScript, Perl, PHP, Java, .NET (C#, F#, VB.NET, C++), Ruby, Node.js, Objective-C, and, the focus of this article, SharePoint development. The catch is that using the raw interface requires solid knowledge of the basic web programming stack: manipulating HTTP headers and working with the HTTP body, responses, and streams.
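To give a feel for what “raw” means, here is a minimal sketch of calling the S3 REST API directly from C#, using the signature scheme S3 used at the time of writing (an HMAC-SHA1 signature over a canonical string, sent in the Authorization header). The bucket, key, and credentials are placeholders, and real code would need proper error handling:

using System;
using System.Net;
using System.Security.Cryptography;
using System.Text;

class S3RestGetExample
{
    static void Main()
    {
        string accessKeyId = "YOUR-ACCESS-KEY-ID";        // placeholder
        string secretKey   = "YOUR-SECRET-ACCESS-KEY";    // placeholder
        string bucketName  = "test";
        string objectKey   = "devPrefix-MyVideo.mp4";

        // StringToSign = verb \n content-md5 \n content-type \n date \n amz-headers + resource.
        // We send the timestamp in x-amz-date, so the Date slot stays empty.
        string amzDate = DateTime.UtcNow.ToString("R");
        string canonicalizedResource = "/" + bucketName + "/" + objectKey;
        string stringToSign = "GET\n\n\n\n" + "x-amz-date:" + amzDate + "\n" + canonicalizedResource;

        string signature;
        using (HMACSHA1 hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey)))
        {
            signature = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));
        }

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://s3.amazonaws.com" + canonicalizedResource);
        request.Method = "GET";
        request.Headers.Add("x-amz-date", amzDate);
        request.Headers.Add("Authorization", "AWS " + accessKeyId + ":" + signature);

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine("HTTP {0}, {1} bytes", response.StatusCode, response.ContentLength);
        }
    }
}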

The SOAP interface is easier to work with if you’ve ever done web service programming: just generate a proxy from the SOAP WSDL and write code against the generated proxies. Most programming languages support proxy generation. The only caveat is that when working with certificates, security policies, and all things security related, the SOAP interface can easily get in the way and cause headaches and issues later down the road.

Using an SDK for your programming language is the simplest approach because it hides and encapsulates all the gory details of the previous two methods; underneath, the SDK frameworks use a mixture of REST and SOAP calls depending on the SDK. Luckily for SharePoint developers, there is a .NET SDK available that even includes a Visual Studio plug-in and project template to speed up your usage of Amazon S3 in the SharePoint world. If you navigate to https://aws.amazon.com/net/ you will find a ton of videos, sample code, articles, etc., showing you how best to write code against the .NET SDK.

So, in the interest of speed, I will outline the steps I used with the AWS SDK for .NET to upload media from a SharePoint list to Amazon S3 using the AmazonS3AssetStorageManager.

Using the Amazon S3 .NET SDK

The first thing I did was download the SDK for .NET and install it into the Visual Studio 2010 development environment. Once that was done, I created a dummy project in Visual Studio to see how the AWS project template leads you to interact with AWS.


I let the wizard run and examined the code produced by the project template. The wizard first asked me to enter my AWS credentials and set up my account information.


I typed in my display name, Access Key ID, and Secret Access Key (the account number is optional). Immediately a project opened with all the information needed to programmatically work with Amazon S3 in Visual Studio 2010.


The portion of the generated source code that mattered for my Amazon S3 implementation was this:

// Print the number of Amazon S3 Buckets.
AmazonS3 s3Client = AWSClientFactory.CreateAmazonS3Client();

try
{
    ListBucketsResponse response = s3Client.ListBuckets();
    int numBuckets = 0;
    if (response.Buckets != null &&
        response.Buckets.Count > 0)
    {
        numBuckets = response.Buckets.Count;
    }
    sr.WriteLine("You have " + numBuckets + " Amazon S3 bucket(s) in the US Standard region.");
}

The code above simply lists the buckets for my Amazon S3 account and credentials. This led me to investigate the AWSClientFactory and AmazonS3Client classes. Lo and behold, there are many methods and classes that make it very easy to upload entries into your Amazon bucket. The main object to work with is AmazonS3Client; it allows you to delete, create, upload, modify, and retrieve almost any item in Amazon S3, and it supports both synchronous and asynchronous calls for more efficient development and implementation patterns.
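As a small illustration of how other AmazonS3Client calls follow the same request/response pattern, here is a hedged sketch that lists what is already in the bucket; the bucket name, key prefix, and credentials are placeholders:

using System;
using Amazon;
using Amazon.S3;
using Amazon.S3.Model;

class ListBucketContentsExample
{
    static void Main()
    {
        // Placeholder credentials and bucket; substitute your own values.
        using (AmazonS3 client = AWSClientFactory.CreateAmazonS3Client(
            "YOUR-ACCESS-KEY-ID", "YOUR-SECRET-ACCESS-KEY"))
        {
            ListObjectsRequest request = new ListObjectsRequest()
                .WithBucketName("test")
                .WithPrefix("devPrefix-");   // only the items this solution uploaded

            using (ListObjectsResponse response = client.ListObjects(request))
            {
                foreach (S3Object entry in response.S3Objects)
                {
                    Console.WriteLine(entry.Key);
                }
            }
        }
    }
}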

So basically, all I needed were the AmazonS3Client, the media file stream, the file name, the bucket name, the Access Key ID, and the Secret Access Key to upload the item into the Amazon S3 bucket. Once I uploaded the media into the bucket, I simply returned the URL to where the item resides inside Amazon S3. I won’t go into the specifics, as you can step through the code above and read the very detailed documentation Amazon provides.

So you may be wondering how I got it all to run with Soledad’s design?

Well, getting all this to run required hooking into the Storage Manager. Soledad’s design also makes use of the Factory pattern: there is a factory class that gives you the particular storage manager based on a setting in the Config Store list. All configuration is saved in a SharePoint list, and the SharePoint Config Store is used to retrieve it. Here is the part of the factory code that returns the AmazonS3AssetStorageManager:

public class AssetStorageFactory
{
    static public IAssetStorageManager GetStorageManager(string configCategory, string webUrl)
    {
        var configHelper = new ConfigHelper(webUrl);
        string storageMethod = configHelper.GetValue(configCategory, StorageMethodConfigKey);
        if ("AmazonS3".Equals(storageMethod, StringComparison.InvariantCultureIgnoreCase))
        {
            // Bucket name, key prefix, and credentials are sample values here;
            // in practice these would also come from configuration.
            return new AmazonS3AssetStorageManager("test", "devPrefix", "AB123-myAmazonKEYID-ZZZ", "123ABC-MySecretKeyID-ZZZ");
        }

        // ... the other storage managers (FileSystem, FTP, SPLibrary) are resolved here

        throw new ArgumentException(String.Format("Incorrect configuration Value '{0}' in ConfigStore for category '{1}' and key '{2}'. Supported options are: '{3}'",
            storageMethod, configCategory, StorageMethodConfigKey, "AmazonS3|FileSystem|FTP|SPLibrary"));
    }
}
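For illustration, calling the factory from SharePoint code might look like the sketch below. The “Media” category name is an assumption on my part (the real category and key names live in the Config Store), so treat this as a usage sketch rather than the exact code from the solution:

using System.IO;
using Microsoft.SharePoint;
using Common.AssetStorage;

public static class MediaUploadExample
{
    // Hypothetical caller (e.g. the upload form or the timer job).
    public static string SaveToConfiguredStore(SPWeb web, string fileName, Stream fileStream)
    {
        IAssetStorageManager storage = AssetStorageFactory.GetStorageManager("Media", web.Url);

        // Returns the URL of the stored asset; for the Amazon S3 flavor this is the
        // public https://s3.amazonaws.com/... address built by the storage manager.
        return storage.Save(fileName, fileStream);
    }
}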

Once everything was hooked up, all I had to do was deploy the components and configure the entries inside the Config Store, and the rest is… “ourstory”.

In case you’re wondering what the requirements are for deploying only the Amazon S3 part, that’s even simpler. The Amazon S3 SDK provides all of its functionality in a single assembly (AWSSDK.dll); with a default install it can be found at C:\Program Files (x86)\AWS SDK for .NET\bin. The best thing about this assembly is that it’s strongly named, which means that to use it with SharePoint you just install/register it in the Global Assembly Cache (GAC). The assembly needs to be installed on every server that runs or interacts with the Amazon S3 integration.

 

Happy SharePointing!!!

9 thoughts on “Using the Cloud as a Media Hosting Option For SharePoint 2010 (Part 1–Amazon S3)”

  1. I always enjoy learning what other people think about Amazon Web Services and how they use them. Check out my very own tool CloudBerry Explorer that helps manage S3 on Windows. It is freeware.


  2. Thanks very much for posting this. I’ve been looking at something similar in order to work with large GIS datasets in SharePoint. I have a quick question if you don’t mind. One of the drawbacks, in my opinion, of using RBS is that the data is still considered part of the total site collection size. Could this be adapted to essentially put a link in the SharePoint list that allows a user to upload / download directly from AWS without it touching SharePoint at all?

    I think that is what you’re already doing in a way. I’m not a developer but am researching this to determine capability prior to creating the project for the developers.

    Thanks again.


  3. Thanks for replying.

    I am probably wrong about this, but from what I understand StoragePoint still uses RBS, which means SharePoint still considers the data part of the site collection size.

    Ref – http://sharepoint.microsoft.com/blog/Pages/BlogPost.aspx?pID=988
    “The content database size includes both metadata and BLOBs regardless of where the BLOBs are located and use of RBS does not bypass or increase these limits.”

    What I am looking to do is essentially bypass the SharePoint limits altogether. That way the only thing that SharePoint sees is the link, and the file itself isn’t included as part of site collection size.

    Kind of like a dropbox solution, you hit upload and it sends the file somewhere else and just leaves a download link in the list.

    Perhaps it isn’t possible. Thanks for your feedback though.


  4. MR_Sheister, your link idea is completely possible, and something we do often in our solutions. If you are interested in additional information, feel free to reach out to me at brandong@attunix.com. Thanks, Brandon.

