Day 6: Multiple Linear Regression: Predicting House Prices

  • AffineStructure 9 years ago + 2 comments

    Writing this in R was frustrating. Some things I learned along the way:

    1. How to read in data with readLines and use strsplit to put the data neatly into data frames

    2. How to extract the coefficients from lm so they can be multiplied with the data frames

    3. I struggled with how to pass a model to lm when the number of predictor vectors isn't fixed, since I kept running into data type problems. I finally realized that lm accepts a matrix on the right-hand side, so you can do something like lm(df1[[length(df1)]] ~ as.matrix(df1[-length(df1)]))

    4. How to print the results cleanly, since R puts an annoying index at the beginning of everything you print. I used a for loop and the cat function with a line break. HackerRank accepted the answer nicely that way.
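
    A minimal sketch of the four steps above. It uses a small made-up input (prices follow 1 + 2*x1 + 3*x2 exactly) in place of readLines(file("stdin")), and the variable names are illustrative rather than taken from the original solution.

```r
# Made-up input in the challenge's layout: "F N", then N rows of
# F features followed by the price, then T, then T rows of F features.
input <- c("2 4",
           "1 1 6",
           "2 1 8",
           "1 2 9",
           "3 2 13",
           "2",
           "2 2",
           "4 1")

header <- as.numeric(strsplit(input[1], " ")[[1]])
f <- header[1]
n <- header[2]

# Step 1: split each training row on spaces and stack into a data frame.
train_rows <- strsplit(input[2:(n + 1)], " ")
train <- as.data.frame(do.call(rbind, lapply(train_rows, as.numeric)))

# Step 3: the response is the last column; the predictors go in as one
# matrix, so the formula does not depend on the number of features.
y <- train[[ncol(train)]]
X <- as.matrix(train[-ncol(train)])
fit <- lm(y ~ X)

n_query <- as.numeric(input[n + 2])
query_rows <- strsplit(input[(n + 3):(n + 2 + n_query)], " ")
Q <- do.call(rbind, lapply(query_rows, as.numeric))

# Step 2: coef(fit) is c(intercept, slopes...), so prepend a column of ones.
pred <- as.vector(cbind(1, Q) %*% coef(fit))

# Step 4: cat prints without the "[1]" index prefix.
for (p in pred) cat(format(round(p, 2), nsmall = 2), "\n", sep = "")
```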

    • BryanRJ 9 years ago + 1 comment

      Or you can do "lm(price ~ ., data=foo)", which will create a linear regression model with price as the dependent (response) variable and all other columns in foo as independent (predictor) variables.

      Re: printing, try "write(foo, stdout())", which will be a non-pretty print.
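
      A short sketch of both suggestions, assuming a made-up data frame foo whose column names (x1, x2, price) are chosen only for illustration:

```r
# Made-up data frame; "price" is the column being predicted.
foo <- data.frame(x1 = c(1, 2, 1, 3),
                  x2 = c(1, 1, 2, 2),
                  price = c(6, 8, 9, 13))

# "." on the right-hand side stands for every column except price.
fit <- lm(price ~ ., data = foo)

# write() is aimed at vectors and matrices, so pass it the numeric
# values rather than the data frame itself; ncolumns = 1 gives one
# value per line with no "[1]" prefix.
out <- round(unname(fitted(fit)), 2)
write(out, stdout(), ncolumns = 1)
```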

      • AffineStructure 9 years ago + 1 comment

        When you did "lm(price ~ ., data=foo)", did you have the data saved in a data frame?
        I was having trouble doing it this way because price has to be the last vector. Did you just rename the final column of the data frame to price, or did you have all the vectors floating free in the global environment?

        • BryanRJ 9 years ago + 0 comments

          The "data=foo" argument there tells R that data are supposed to come from the frame named "foo".

          I always recommend building a data frame before performing regression or other modeling. It makes things easier in R.
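
          One way this could look when the number of feature columns is not known in advance: build the frame from a numeric matrix and rename only the last column. The matrix m and its values are made up for the sketch.

```r
# Made-up numeric matrix: feature columns first, price last.
m <- matrix(c(1, 1, 6,
              2, 1, 8,
              1, 2, 9,
              3, 2, 13), ncol = 3, byrow = TRUE)

foo <- as.data.frame(m)            # default column names V1, V2, V3
names(foo)[ncol(foo)] <- "price"   # rename only the last column

# The formula now works for any number of feature columns.
fit <- lm(price ~ ., data = foo)
print(names(foo))
```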

      • alexey_filippov 9 years ago + 0 comments

        On strsplit, scan actually does the same out of the box:

        nums <- suppressWarnings(readLines(file("stdin")))
        
        fn <- scan(text=nums[1])
        f <- fn[1]
        n <- fn[2]
        

        On the coefficients, there's no need to extract anything: as soon as you've got the fitted model, you can use predict to apply it to a data frame of queries.
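
        A minimal sketch of that, with made-up training data and queries. The column names x1 and x2 are illustrative; the one requirement is that the query frame uses the same predictor names as the training frame.

```r
# Made-up training data; price = 1 + 2*x1 + 3*x2 exactly.
train <- data.frame(x1 = c(1, 2, 1, 3),
                    x2 = c(1, 1, 2, 2),
                    price = c(6, 8, 9, 13))
fit <- lm(price ~ ., data = train)

# Queries carry the same predictor columns, but no price.
queries <- data.frame(x1 = c(2, 4),
                      x2 = c(2, 1))

# predict() applies the fitted coefficients to each query row.
pred <- predict(fit, newdata = queries)
cat(format(round(pred, 2), nsmall = 2), sep = "\n")
cat("\n")
```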

        On printing, StackOverflow users suggest something along the lines of,

        write.table(cat(format(answer, nsmall=1), sep="\n"), sep = "", append=T, row.names = F, col.names = F)
        

        Frankly, I don't really understand how cat and write.table interact here, but it seems to work just fine.
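
        A small check that may explain it: cat does the printing itself and returns NULL invisibly, so write.table presumably receives nothing to write and cat alone would be enough. The values in answer are made up.

```r
answer <- c(105.2, 142.68)   # made-up values

# cat() prints one formatted value per line and returns NULL invisibly.
res <- cat(format(answer, nsmall = 1), sep = "\n")
cat("\n")                    # finish the last line

# Confirms that cat() hands nothing on to an enclosing call.
stopifnot(is.null(res))
```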
