Chapter 36 Final project - Final report
Documenting your work is as important as building your data mining pipeline and generating results. If you don’t document your pipeline and your results and present them in an easily understood format, no one will ever know what you found!
36.1 Format
Your final report can be prepared in a format of your choice. However, it must be clearly organized, use consistent formatting for citations/references, contain the required contents (see below), and be accessible to me. To be safe, I advise you to check with me before moving forward with a format. I cannot grade your report if I cannot access it, and it is your responsibility to make sure that I can access it before you submit. (Don’t rely on me to answer my email after 5:00 pm on the day your assignment is due 😉.)
Example formats you might choose to compose your report in:
- Using R Markdown to create a web-accessible HTML page.
  - This could support interactivity and allow you to embed code into your report.
- Using R Markdown and Bookdown to create an eBook.
  - This could support interactivity and allow you to embed code into your report.
- Using LaTeX to create a PDF.
  - If you use LaTeX, I suggest using Overleaf; it’s like Google Docs, but for LaTeX.
- Using a conventional text editor to create a PDF.
If you plan to use this project in a larger portfolio to send to potential employers, I strongly recommend making your report into a well-formatted (and interactive if possible) web page. I am happy to explain how to do so if you are interested.
36.2 Contents
Your report should include the following sections:
- Introduction
- Related work
- Methods
- Results and Discussion
- Conclusion
- Data and Software Availability
- References
You can (and probably should) include subsections within those sections. You may also embed code and its output directly into your document (e.g., as you would if using R Markdown).
If you want to deviate from this structure for some reason, you need to speak with me ahead of time (at least a week before the deadline). Good reasons for deviating from the required sections include planning to submit your work to a conference or competition that requires a different organization, or creating an interactive document that lends itself to a different organization.
Below, I outline the points you must address in each section:
36.2.1 Introduction
Your introduction should:
- introduce your application domain and the specific problem you’re working on, providing enough background for readers to understand the value of your project
- motivate your project
- give an overview of your approach
- give an overview of your results
36.2.3 Methods
Your methods section should:
- describe how you chose your data and/or how you collected your data
- describe your data mining pipeline in sufficient detail such that someone could replicate your project
  - What data preprocessing did you perform? Which data mining algorithms did you apply (and how)? What postprocessing did you do?
- describe any analyses you performed to evaluate your model
- describe the software you used (e.g., languages, libraries, and their versions)
You may wish to include diagrams that give an overview of your methodology.
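One way to make a pipeline easier to replicate is to keep every tunable parameter in one place and to fix any random seeds. The following is only a minimal sketch under assumptions: the names `CONFIG`, `preprocess`, and `split`, the toy data, and the choice of min-max scaling are all hypothetical and are not required for this project.

```python
import random

# Hypothetical settings; report values like these in your methods section.
CONFIG = {"seed": 42, "test_fraction": 0.5}

def preprocess(rows):
    """Drop rows with missing values, then min-max scale each column to [0, 1]."""
    complete = [r for r in rows if None not in r]
    cols = list(zip(*complete))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(r, lows, highs)]
        for r in complete
    ]

def split(rows, test_fraction, seed):
    """Shuffle with a fixed seed so the same train/test split can be reproduced."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy data for illustration only; the second row has a missing value.
raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [5.0, 20.0]]
clean = preprocess(raw)
train, test = split(clean, CONFIG["test_fraction"], CONFIG["seed"])
print(len(train), "training rows,", len(test), "test rows")
```

Because the seed and the split fraction live in `CONFIG`, a reader who reruns the script with the same data should get the same split.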
36.2.4 Results and discussion
Your results and discussion section should describe your results and discuss how your results fit into a broader context. You should include publication-quality visualizations that effectively communicate your results. Depending on the format you use for your final report, you may also wish to embed interactive visualizations.
For example, you might present the performance of your data mining approach and discuss how it compares to a baseline/alternative method.
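As a rough illustration of such a comparison, the sketch below scores a model against a majority-class baseline using accuracy. The labels and predictions are made up for illustration, and `accuracy` and `majority_baseline` are hypothetical helper names; your project may call for a different metric or a different baseline.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline(y_train, n):
    """Predict the most common training label for each of n test examples."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n

# Made-up labels, for illustration only.
y_train = ["spam", "ham", "ham", "ham", "spam", "ham"]
y_test = ["ham", "spam", "ham", "ham", "spam"]
model_pred = ["ham", "spam", "ham", "spam", "spam"]  # stand-in for your model's output

baseline_pred = majority_baseline(y_train, len(y_test))
model_acc = accuracy(y_test, model_pred)
baseline_acc = accuracy(y_test, baseline_pred)

print(f"model accuracy:    {model_acc:.2f}")     # 0.80
print(f"baseline accuracy: {baseline_acc:.2f}")  # 0.60
```

Reporting both numbers side by side lets readers judge how much your approach improves on a trivial strategy.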
36.2.5 Conclusion
Your conclusion should:
- very briefly summarize your findings
- discuss any limitations/shortcomings of your approach/experiments
- suggest future work (e.g., possible extensions to your project)
36.2.6 Data and software availability
Your data and software availability section should:
- provide a working link to where your software can be found and downloaded (e.g., a GitHub repository)
- provide a working link to where your data can be downloaded
If either of those is not feasible for your project (e.g., you collected your own dataset and it’s huge), let me know at least a week before the deadline (the sooner you let me know, the better).
36.2.7 References
Your references should be consistently formatted such that a reader can find each reference (e.g., include authors, year, title, venue/journal, DOI).
If you use Rmd, Bookdown, or LaTeX to create your report, I can help you configure things such that your references are automatically formatted for you.
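If you format references by hand instead, one way to keep them consistent is to store the fields separately and render every entry with the same template. The sketch below is only an illustration: `format_reference` is a hypothetical helper, the output style is an arbitrary choice, and the example entry is a placeholder, not a real publication.

```python
def format_reference(ref):
    """Render a reference as 'Authors (Year). Title. Venue. DOI link'."""
    authors = ", ".join(ref["authors"])
    text = f'{authors} ({ref["year"]}). {ref["title"]}. {ref["venue"]}.'
    # Append the DOI as a resolvable link only when one is available.
    if ref.get("doi"):
        text += f' https://doi.org/{ref["doi"]}'
    return text

# Placeholder entry for illustration; not a real publication or DOI.
example = {
    "authors": ["A. Author", "B. Author"],
    "year": 2024,
    "title": "A placeholder title",
    "venue": "Journal of Placeholders",
    "doi": "10.0000/placeholder",
}

print(format_reference(example))
```

Running every entry through one function means a style change only has to be made in one place.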
36.3 Deliverables
Submit your report through Blackboard. If your report is a .pdf file, attach it to the submission. If your report is a live webpage, include a (working) link.
Submit one report per group, and list the names of all group members in the comments section of the assignment.
As part of your report, you will need to include working links to your code and data (e.g., in a GitHub repository).
36.4 Grading
Your final report will be graded out of 100 points on the following items:
Item | Points |
---|---|
Title accurately describes your project/findings | 5 |
Introduction | 10 |
Related work | 10 |
Methods | 10 |
Results and discussion | 10 |
Conclusion | 10 |
Data and software availability | 10 |
References are consistently formatted | 5 |
Code can be compiled/executed using provided instructions | 10 |
Report is well-organized into proper sections, paragraphs follow logical structure | 10 |
Report is sufficiently professional (few spelling and grammatical errors) | 10 |
I reserve the right to award flex points for exemplary work in any area of this project.
All components of your final project are subject to GVSU and the School of Computing’s academic honesty policies. Violations of these policies may result in a failing grade for the course.