1. Big data processing using Apache Mahout and Hadoop.
2. Automated class attendance using face recognition with OpenCV over an MS Kinect device.
3. GIS crime mapping tool.
4. Crawling web content to answer queries, using keyword mapping to particular topics, image processing, and NLP.
5. Green energy GIS decision support system for demand/supply analysis, siting optimisation, environmental impact analysis, future prediction, and simulation tools (all will be attempted if time permits, and can be completed and extended in future projects).
6. Building interoperable cloud storage that spans the free services from the various providers: Dropbox, Google Drive, Amazon Drive, MS SkyDrive, Ubuntu One, … etc. Future work can add apps for the various devices (iOS, Android, Windows, Mac, Linux) to access the virtual drive as one, as well as automatic backup and restore, and replication for fault tolerance.
The following ideas are general descriptions of classes of applications that can be addressed using different technologies and with varying scope.
A) Cloud Computing for Web-services & Parallel Processing
Take any web services project of your choice, run it on any available cloud such as AWS EC2, and compare its performance and cost with standalone servers or networked clusters of servers. The workload can be a web application hosted on ordinary web servers and/or any parallel program, comparing performance on multi-core hardware, clusters of networked computers, and EC2.
- Measuring web hosting, file sharing, and database availability, performance, and cost on the different architectures.
- Parallel processing performance and cost comparison on the different architectures.
- Writing a comparison report on the strengths and weaknesses of the different architectures and their suitability for different applications.
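The performance and cost comparison in this item can be sketched as a small calculation. This is only an illustration under stated assumptions: the architecture names, runtimes, and hourly prices below are hypothetical placeholders, to be replaced with values actually measured on each platform.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    name: str              # architecture label (hypothetical)
    runtime_s: float       # measured wall-clock time for one job, in seconds
    price_per_hour: float  # assumed hourly cost of the hardware, any currency


def compare(baseline, others):
    """Return speedup and cost per job for each architecture, relative to baseline."""
    rows = []
    for m in [baseline] + list(others):
        rows.append({
            "name": m.name,
            "speedup": baseline.runtime_s / m.runtime_s,
            "cost_per_job": m.runtime_s / 3600.0 * m.price_per_hour,
        })
    return rows


# Placeholder numbers only; substitute real measurements.
single = Measurement("single-core server", 3600.0, 0.10)
cluster = Measurement("4-node cluster", 1200.0, 0.40)
cloud = Measurement("cloud instance", 900.0, 0.50)

for row in compare(single, [cluster, cloud]):
    print(f"{row['name']:>20}  speedup={row['speedup']:.2f}  "
          f"cost/job={row['cost_per_job']:.3f}")
```

Reporting speedup and cost per job side by side makes it visible when a faster architecture is also the more expensive one for a given workload.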
B) GPU vs CPU performance evaluation
Develop an image or video processing application, such as feature extraction, pattern recognition, or any graphics textbook problem, implement it using the CUDA toolkit on a GPU, and report the CPU performance compared to the GPU.
- Learn the CUDA GPU development toolkit.
- Compare GPU vs CPU performance.
- Compare GPU vs other parallel architectures, such as multi-core, clusters of computers, or any accessible high-performance computers.
C) High Dimensional Data Analysis
Experiment with any high-dimensional dataset, such as those in the UCI (University of California, Irvine) Machine Learning Repository maintained by the Center for Machine Learning and Intelligent Systems, apply various multivariate statistical methods from R or MATLAB, and compare and report the results.
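As one example of a multivariate method, the first principal component of a dataset can be approximated by power iteration on the covariance matrix. The sketch below is a plain-Python illustration on a made-up toy dataset, not a replacement for the R or MATLAB routines (such as `prcomp` in R or `pca` in MATLAB); the function names are hypothetical.

```python
import math


def center(rows):
    """Subtract the column mean from every value."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration.

    The start vector (1, 2, ..., d) is an arbitrary choice; power iteration
    fails if the start vector is exactly orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: no variance along v
            break
        v = [x / norm for x in w]
    return v


# Toy data lying exactly on the line y = 2x (made up for illustration).
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc1 = first_component(covariance(center(data)))
print(pc1)
```

For data on the line y = 2x the direction of greatest variance is proportional to (1, 2), which is what the returned unit vector should point along.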
D) Understanding Crowd-Sourced Web Content
Use web crawlers to download data available in the public domain on the internet, and apply natural language processing and learning techniques to extract information relevant to specific or general queries.
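A minimal sketch of the extraction step, assuming the pages have already been downloaded: strip the HTML down to visible text, then score each topic by counting keyword occurrences. The topic vocabulary and the sample page are hypothetical, and simple keyword counting stands in for the NLP and learning techniques a real project would use.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from an HTML page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def page_text(html):
    """Return the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical topic vocabulary; a real project would curate or learn this.
TOPIC_KEYWORDS = {
    "energy": {"solar", "wind", "grid", "turbine"},
    "traffic": {"road", "congestion", "vehicle", "crowd"},
}


def score_topics(text, topic_keywords=TOPIC_KEYWORDS):
    """Count how many word occurrences in text belong to each topic."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return {topic: sum(words[k] for k in keys)
            for topic, keys in topic_keywords.items()}


# Made-up page; the word "road" appears only inside a script and is ignored.
sample = ("<html><body><h1>Wind and solar</h1>"
          "<p>New wind turbine feeds the grid.</p>"
          "<script>var road = 1;</script></body></html>")
print(score_topics(page_text(sample)))
```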
E) Automatic Traffic & Crowd Management
Build a traffic and crowd management system that uses information collected from cell phone access points, street cameras, radar cameras, satellite images, location-tagged posts on Twitter, Facebook, or blogs, or any other accessible source to estimate the number of people in a given location at a given time, and the entry and exit routes for that location. The system also needs to estimate safe entries and exits, detect emergency situations and support intervention, and control crowds and lead them out of hazardous scenarios.
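The core estimate, the number of people in a given location at a given time, can be sketched as counting distinct devices that reported a position inside a radius and a time window. The record layout `(device_id, latitude, longitude, timestamp)` and the sample values are assumptions made for illustration, and a distinct-device count is only a proxy for the true head count.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_devices(reports, lat, lon, radius_m, when, window):
    """Count distinct device ids seen within radius_m of (lat, lon)
    and within +/- window of the time `when`."""
    seen = set()
    for device_id, r_lat, r_lon, r_time in reports:
        in_time = abs(r_time - when) <= window
        if in_time and haversine_m(lat, lon, r_lat, r_lon) <= radius_m:
            seen.add(device_id)
    return len(seen)


# Made-up reports: (device_id, latitude, longitude, timestamp).
t0 = datetime(2024, 1, 1, 12, 0)
reports = [
    ("a", 30.0444, 31.2357, t0),
    ("a", 30.0445, 31.2357, t0 + timedelta(minutes=1)),  # same device again
    ("b", 30.0454, 31.2357, t0 + timedelta(minutes=2)),  # about 111 m away
    ("c", 30.0544, 31.2357, t0),                         # about 1.1 km away
    ("d", 30.0444, 31.2357, t0 + timedelta(hours=2)),    # outside the window
]
print(count_devices(reports, 30.0444, 31.2357, 500.0, t0, timedelta(minutes=10)))
```

Counting distinct device ids rather than raw reports avoids inflating the estimate when one phone reports several times in the same window.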
F) Building Ontologies, specifically Legal ontologies
G) Estimating college admission GPA requirements (Tansik) from the marks in each high school subject, for each specialisation and according to society's needs, using numerical methods.
H) Connecting sensors to cell phones for continuous geo-tagged updates to public databases, for example weather updates using temperature and humidity sensors.
I) Building GIS systems for various objectives, including but not limited to: house hunting, administrative data visualization, utilities management, market research, and environmental monitoring.
Previous & Ongoing Experiments:
1. My Distributed MSA: https://github.com/mhelal/mmDST
2. Cluster boundary cut-offs using distance matrix gaps.
3. Programming Assignments Marking System: https://github.com/mhelal/marks
4. Curriculum Mapping System
5. Class Scheduling System
6. Python web crawling scripts that use the various social networking APIs.
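The list does not describe how experiment 2 (cut-offs from distance matrix gaps) works. One possible reading, offered purely as an assumption, is to sort all pairwise distances, place the cut-off in the middle of the largest gap between consecutive values, and group points that are closer than that cut-off. The function names and toy data below are hypothetical.

```python
def gap_cutoff(dist):
    """Return a threshold placed in the largest gap between
    consecutive sorted pairwise distances."""
    n = len(dist)
    values = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    if len(values) < 2:
        raise ValueError("need at least three points")
    gaps = [(values[k + 1] - values[k], k) for k in range(len(values) - 1)]
    _, k = max(gaps)
    return (values[k] + values[k + 1]) / 2.0


def clusters(dist, cutoff):
    """Single-linkage grouping: points closer than cutoff share a cluster."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < cutoff:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy one-dimensional points (made up): two well-separated groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
cut = gap_cutoff(dist)
print(cut, clusters(dist, cut))
```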