FalconCode Schema. Current version as of 15 March 2023.



Table Descriptions


1. Courses

Describes each semester/offering of our Introduction to Computing Course. Each course typically has 400-600 students.


Data Element Description
id A unique identifier for this course
semester A string naming the semester in which this course was offered
year The calendar year that this course took place (e.g., 2021)

Number of Records

TBA


Additional Notes

  • Courses 2-4 are currently populated as of version 1.0. Courses 5-6 will be added in Summer 2023.

2. Students

Describes an individual enrolled in a course. A student may appear more than once in this table if they took our introductory course more than one time.


Data Element Description
id A unique, random identifier. This identifier will be regenerated with each FalconCode release.
course_id The id of the course that this student was enrolled in.
stem_major Student's academic major at the time of data collection. Value is 1 if the student IS a STEM major; 0 if the student IS NOT a STEM major; and -1 if the student is undeclared OR disenrolled.
gender Student's reported gender. M = Male, F = Female
ethnicity Student's reported ethnicity.

Number of Records

TBA


Additional Notes

  • Student rosters for courses 2-4 are currently populated as of version 1.0.
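
The three-valued stem_major code is easy to misread as a plain boolean. The sketch below shows one way to decode it and to leave the -1 rows out of a STEM share. The label wording, the helper name describe_stem_major, and the sample rows are illustrative assumptions, not part of the dataset.

```python
# Labels are our own wording; the codes come from the stem_major description.
STEM_MAJOR_LABELS = {
    1: "STEM",
    0: "non-STEM",
    -1: "undeclared or disenrolled",
}


def describe_stem_major(code: int) -> str:
    """Translate a stem_major code into a readable label."""
    try:
        return STEM_MAJOR_LABELS[code]
    except KeyError:
        raise ValueError(f"unexpected stem_major code: {code!r}") from None


# Hypothetical rows for illustration only: (id, course_id, stem_major)
sample_students = [("a1", 2, 1), ("b2", 2, 0), ("c3", 3, -1)]

labels = [describe_stem_major(row[2]) for row in sample_students]

# Exclude -1 before computing a share; otherwise it would skew the sum.
declared = [row for row in sample_students if row[2] in (0, 1)]
stem_share = sum(row[2] for row in declared) / len(declared)
```

Because a student who repeats the course appears once per enrollment, counts taken from this table are counts of enrollments rather than of distinct individuals.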

3. Runs

Describes each execution of students' programs.


Data Element Description
student_id The student's random (i.e., anonymized) identifier from the students table.
course_id The course that the student was enrolled in while executing this program.
problem_id The name of the problem that the student is trying to solve.
timestamp The server time when the student's code was run.
score The score reported by the unit test. This value is -1 if the student ran the program locally (i.e., did not submit for a grade).
code_hash The hash of the student's code submission.

Number of Records

TBA


Additional Notes

  • None.
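
Since a score of -1 marks a local run rather than a graded submission, aggregates over score usually need to treat the two cases separately. The following is a minimal sketch using an in-memory SQLite database. The table name runs, the column types, the timestamp format, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types and the timestamp format are assumptions.
conn.execute(
    "CREATE TABLE runs (student_id TEXT, course_id INTEGER, problem_id TEXT, "
    "timestamp TEXT, score REAL, code_hash TEXT)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("s1", 2, "skill_1", "2021-09-01 10:00:00", -1, "h1"),
        ("s1", 2, "skill_1", "2021-09-01 10:05:00", 50, "h2"),
        ("s1", 2, "skill_1", "2021-09-01 10:09:00", 100, "h3"),
        ("s2", 2, "skill_1", "2021-09-01 11:00:00", -1, "h4"),
    ],
)

# Count local runs and graded submissions separately, and take the best
# score over graded submissions only (NULL if there were none).
rows = conn.execute(
    """
    SELECT student_id, course_id, problem_id,
           SUM(score = -1) AS local_runs,
           SUM(score <> -1) AS submissions,
           MAX(CASE WHEN score <> -1 THEN score END) AS best_score
    FROM runs
    GROUP BY student_id, course_id, problem_id
    ORDER BY student_id
    """
).fetchall()
```

A student with only local runs gets a best_score of None here, which keeps the -1 sentinel from being mistaken for a real grade.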

4. Code Samples

Catalogues each unique student code submission.


Data Element Description
hash The hash of the entire code submission.
source_code The student's program.
redacted A flag indicating if our redaction script removed one or more blacklisted tokens. This value is 1 if a redaction occurred, and 0 if a redaction did not occur.

Number of Records

TBA


Additional Notes

  • The anonymizing script is continuously being updated and improved. If you see a specific error, please contact the authors so that we can continue to add to our blacklist.
  • We cannot guarantee that anonymized programs will run exactly as they did on the student's computer.
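
Because this table stores each unique submission once, several rows in Runs can point at the same code sample through code_hash. The sketch below joins the two tables to count runs per sample and to carry the redacted flag along. The table names runs and code_samples, the trimmed column set, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names are assumptions; runs is trimmed to the columns used here.
conn.executescript(
    """
    CREATE TABLE code_samples (hash TEXT PRIMARY KEY, source_code TEXT, redacted INTEGER);
    CREATE TABLE runs (student_id TEXT, problem_id TEXT, code_hash TEXT);
    """
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO code_samples VALUES (?, ?, ?)",
    [
        ("h1", "print('hi')", 0),
        ("h2", "name = input()\nprint(name)", 1),
    ],
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [
        ("s1", "skill_1", "h1"),  # same code executed twice
        ("s1", "skill_1", "h1"),
        ("s1", "skill_1", "h2"),
    ],
)

# One output row per code sample, with the number of runs that used it.
runs_per_sample = conn.execute(
    """
    SELECT c.hash, c.redacted, COUNT(*) AS n_runs
    FROM code_samples AS c
    JOIN runs AS r ON r.code_hash = c.hash
    GROUP BY c.hash
    ORDER BY c.hash
    """
).fetchall()
```

Filtering on redacted = 0 is one way to restrict an analysis to programs the redaction script left untouched.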

5. Problems

Catalogues each problem assigned to students, along with metadata tags describing the concepts the problem was intended to exercise.


Data Element Description
id The "name" of the problem. This value is unique for a given course.
course_id The course that this problem was assigned to.
type The type/complexity of this problem. Possible values are:
  • skill: a small program (3-5 lines)
  • lab: a medium sized program (5-30 lines)
  • project: a multi-stage program (100-500 lines)
exam 1 if this problem was assigned to students during an exam. 0 otherwise
testcase The unit test used to grade/evaluate this code.
max_score The maximum score achievable from the unit test.
input_str 1 if the program is required to get a string input from the user. 0 otherwise
input_cast 1 if the program is required to convert an input from the user into another type (e.g., float, int). 0 otherwise
output 1 if the program is required to print something to the terminal. 0 otherwise
assignment 1 if the program is required to create a variable and/or update its value. 0 otherwise
conditional 1 if the program is required to use a conditional statement (single or nested). 0 otherwise
function_call 1 if the program is required to call a non-standard function (i.e., not print or input). 0 otherwise
function_def 1 if the program requires a custom function to be defined. 0 otherwise
function_return 1 if the program requires a custom function that returns a value. 0 otherwise
loop_counting 1 if the program is required to create a loop that executes a specific number of times. 0 otherwise
loop_until 1 if the program is required to create a loop that continues until a condition is met. 0 otherwise
loop_elements 1 if the program is required to loop through the elements of a collection (i.e., a list). 0 otherwise
loop_nested 1 if the program requires a nested loop. 0 otherwise
stat_calculate 1 if the program is required to calculate a statistic (e.g., max, min, average, std dev). 0 otherwise
file_read 1 if the program needs to read a file. 0 otherwise
file_write 1 if the program needs to write to a file. 0 otherwise
list 1 if the program needs to store values in a list. 0 otherwise
list_2d 1 if the program needs to store values in a multi-dimensional list. 0 otherwise
dictionary 1 if the program needs to store values in a dictionary. 0 otherwise
item_set 1 if the program needs to store values in a set. 0 otherwise
tuple 1 if the program needs to store values in a tuple. 0 otherwise

Number of Records

TBA


Additional Notes

  • Metadata tags are based on how we intended for the problem to be solved. In reality, students find all sorts of ways to solve our problems, and may end up developing a unique solution.
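
Since a problem id is unique only within a course, linking Runs to Problems needs both course_id and the problem name. The sketch below shows that join and expresses each graded score as a fraction of max_score. The table names, the trimmed column sets, and the sample rows are assumptions, as is the premise that score and max_score share the same scale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names are assumptions; both tables are trimmed to the columns used.
conn.executescript(
    """
    CREATE TABLE problems (id TEXT, course_id INTEGER, max_score REAL,
                           PRIMARY KEY (course_id, id));
    CREATE TABLE runs (student_id TEXT, course_id INTEGER, problem_id TEXT, score REAL);
    """
)
# Hypothetical rows: the same problem name appears in two courses.
conn.executemany(
    "INSERT INTO problems VALUES (?, ?, ?)",
    [("skill_1", 2, 10), ("skill_1", 3, 20)],
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [
        ("s1", 2, "skill_1", 5),
        ("s1", 2, "skill_1", -1),  # local run, excluded below
        ("s9", 3, "skill_1", 20),
    ],
)

# Join on the composite key and normalize graded scores.
fractions = conn.execute(
    """
    SELECT r.student_id, r.score * 1.0 / p.max_score AS fraction
    FROM runs AS r
    JOIN problems AS p
      ON p.course_id = r.course_id AND p.id = r.problem_id
    WHERE r.score <> -1
    ORDER BY r.student_id
    """
).fetchall()

# Joining on the name alone matches each run to every course's copy.
name_only_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM runs AS r
    JOIN problems AS p ON p.id = r.problem_id
    WHERE r.score <> -1
    """
).fetchone()[0]
```

With these sample rows the composite-key join returns one row per graded run, while the name-only join returns twice as many, which is the duplication the course_id condition avoids.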