Chapter 33 Final project - Proposal
Your project proposal serves two purposes:
- for you to end up with a solid plan for your project, and
- for me to evaluate whether your project fits my expectations and to help you improve it where it doesn’t.
33.1 Project expectations
In your project, you will choose a data mining application domain and develop a data mining pipeline for a specific problem. The particular application area, techniques you choose to use, and the outputs of your project are open-ended and up to you to propose. Be creative!
However, I do have the following expectations for all projects:
- I expect you to design and implement a complete data mining pipeline from start to finish.
- I expect your project to be well-researched, appropriate, and achievable in the given time frame.
- If you’re unsure, come chat with me during office hours!
- You may use existing software/libraries to implement your pipeline. However, you must implement at least one component (e.g., data collection, data mining algorithm) of your project yourself.
- For example, if your project involves extensive data preprocessing and then the use of symbolic regression to build a model that you plan to compare to an alternative approach, you may choose to implement your symbolic regression algorithm and use existing libraries for your data preprocessing and for the alternative model.
- I expect you to either collect your own dataset or use a publicly available dataset.
- I expect you to evaluate your data mining approach in some way.
- For example, conducting a small study that compares your approach to an alternative approach, assessing the accuracy of your model by holding back validation data, assessing the robustness of your model, etc.
- I expect you to effectively communicate your methodology and findings in a final report.
- This will take the form of well-organized, well-written explanations with publication-quality visualizations.
I will consider the overall difficulty of your project when determining whether or not it is sufficient (or infeasible) for your final project. For example, if you are collecting your own data, your pipeline will need to be simpler because of the extra time and effort required for data collection.
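As a rough illustration of the evaluation expectation above (holding back data and comparing against an alternative approach), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic one-feature dataset, the helper names (`train_test_split`, `majority_baseline`, `threshold_model`, `accuracy`), and the toy threshold classifier stand in for whatever data and methods your own project uses.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold back a fraction as a test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_baseline(train):
    """Alternative approach: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common


def threshold_model(train):
    """Toy 1-D classifier: split at the midpoint between the two class means."""
    xs0 = [x for x, label in train if label == 0]
    xs1 = [x for x, label in train if label == 1]
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    cut = (mean0 + mean1) / 2
    # Predict the class whose mean lies on the same side of the cut.
    if mean1 >= mean0:
        return lambda x: 1 if x > cut else 0
    return lambda x: 1 if x < cut else 0


def accuracy(predict, test):
    """Fraction of held-out examples that are predicted correctly."""
    return sum(1 for x, label in test if predict(x) == label) / len(test)


# Hypothetical synthetic data: one numeric feature, two well-separated classes.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
data += [(rng.gauss(5.0, 1.0), 1) for _ in range(100)]

train, test = train_test_split(data, test_fraction=0.25, seed=1)

# Fit on the training portion only; score on the held-out portion only.
baseline_acc = accuracy(majority_baseline(train), test)
model_acc = accuracy(threshold_model(train), test)

print(f"baseline accuracy on held-out data: {baseline_acc:.2f}")
print(f"model accuracy on held-out data:    {model_acc:.2f}")
```

The point of the sketch is the structure rather than the model: the test rows are never seen during fitting, and the same held-out rows are used to score both your approach and the alternative, so the comparison is fair.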
33.2 Proposal guidelines
Your project proposal should be under 3 pages, be formatted under the following headings, and address each of the following points:
- Project overview
- Provide an overview of your project.
- What is the application domain you are working on?
- What specific problem are you working on?
- Briefly, what is your overall approach?
- Why? What is your motivation?
- What are your goals for this project?
- Related work
- What similar projects have been attempted in your proposed domain? How do they relate to your project?
- Data plan
- What data do you plan to use for your project?
- If you plan on collecting your own data, how do you plan to do so?
- If you plan to use an existing dataset, which dataset do you plan to use and why? Who collected your data and for what original purpose?
- What preprocessing do you expect to need to perform on your data?
- Implementation plan
- Describe your proposed data mining pipeline.
- How will you implement your data mining pipeline?
- What major component do you plan to implement yourself?
- What components do you plan to use existing software/libraries for? Which software/libraries do you plan to use?
- Evaluation plan
- How do you plan to evaluate your data mining algorithm?
- Provide a broad overview of how you plan to evaluate your approach. How will you measure success? Will you compare to an alternative method? Measure accuracy with a ground truth testing set?
- Plan for group collaboration
- What is your plan for collaborating as a group? Do you plan to meet regularly in person or over Zoom, or to coordinate asynchronously on a messaging application like Discord?
- How do you plan to collaboratively implement your data mining pipeline? How will you manage your code and your data?
- Timeline
- Outline your week-by-week goals for completing your project by 11/30.
- Feel free to communicate this timeline as a bulleted list, a table, etc.
- References
- List your references.
Include a project title and the members of your group as part of your proposal. Your proposal should be organized into clearly labeled sections (one to address each of the eight categories above). Each section should be written in paragraph form (with the exception of the references and the timeline section where you may opt to use a list or a table). Your proposal will be graded on the quality of your writing as well as your answers to the previously listed prompts.
It should be evident that your group has thoroughly researched the area and formulated a plan to achieve your goals. I recognize that things may change during the course of the project, but I’m looking here to make sure you have a solid initial plan. Any significant project like this requires significant prework before implementation.
33.3 Grading
The proposal will be graded as follows (out of 100):
Item | Points |
---|---|
Project overview section addresses required points | 10 |
Related work section addresses required points | 15 |
Data plan section addresses required points | 10 |
Implementation plan section addresses required points | 15 |
Evaluation plan section addresses required points | 10 |
Plan for group collaboration section addresses required points | 10 |
Timeline section addresses required points | 10 |
Related work and references are included and cited appropriately | 5 |
Writing is appropriately formal | 5 |
Writing is clear | 5 |
Proposal is well-formatted | 5 |
I reserve the right to award flex points for exemplary work in any areas of this project.
All components of your final project are subject to GVSU and the School of Computing’s academic honesty policies. Violations of these policies may result in failure of the course.