Day 6: Multiple Linear Regression: Predicting House Prices


  • alexey_filippov 9 years ago + 2 comments

    R is an amazing language indeed: I struggle to write a concatenation, but a linear model is so easy! Call lm, and you're done! And there's even more!

    Still, even reading stdin is a pain. On the plus side, all the stats stuff just works.


    • AffineStructure 9 years ago + 2 comments

      Writing this in R was frustrating. Some things I learned on my way:

      1. How to read in data with readLines and use strsplit to put the data neatly into data frames.

      2. How to extract the coefficients from lm and multiply them with the data frames.

      3. I struggled with how to pass the data to lm without a fixed number of vectors, since I had problems with the data type. I finally realized that lm can take a matrix as input, so we had to do something like lm(df1[[length(df1)]] ~ as.matrix(df1[-(length(df1))]))

      4. How to print the results properly, since R puts an annoying index at the beginning of everything you print. I used a for loop and the cat function with a line break. HackerRank accepted the answer that way.
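A minimal sketch of the four steps above, assuming the input layout is a header line "f n", then n rows of f features followed by the price, then a count t and t query rows. The sample lines and the helper parse_rows are made up for illustration; on the judge the lines would presumably come from readLines(file("stdin")) instead.

```r
# Hypothetical input; generated from price = 1 + 2*x1 + 3*x2 so the fit is exact.
lines <- c("2 5", "0 0 1", "1 0 3", "0 1 4", "1 1 6", "2 1 8", "2", "2 2", "3 1")

# Split lines on single spaces and stack them into a numeric matrix.
parse_rows <- function(x) {
  do.call(rbind, lapply(strsplit(x, " ", fixed = TRUE), as.numeric))
}

header <- as.numeric(strsplit(lines[1], " ", fixed = TRUE)[[1]])
f <- header[1]
n <- header[2]

df1 <- as.data.frame(parse_rows(lines[2:(n + 1)]))
t <- as.numeric(lines[n + 2])
queries <- parse_rows(lines[(n + 3):(n + 2 + t)])

# [[ ]] pulls the last column out as a plain vector; the predictors go in as a matrix,
# so the formula does not depend on how many feature columns there are.
y <- df1[[length(df1)]]
X <- as.matrix(df1[-length(df1)])
fit <- lm(y ~ X)

# coef(fit) is (intercept, slope_1, ..., slope_f); prepend a column of ones to the queries.
pred <- as.vector(cbind(1, queries) %*% coef(fit))

# cat prints bare values, without the [1] index that print adds.
for (p in pred) cat(p, "\n", sep = "")
```

Multiplying by the coefficients by hand mirrors point 2; predict on the fitted model is an alternative that avoids building the column of ones.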


      • BryanRJ 9 years ago + 1 comment

        Or you can do "lm(price ~ ., data=foo)", which will create a linear regression model with price as the dependent (response) variable and all other columns in foo as independent variables (predictors).

        Re: printing, try "write(foo, stdout())", which will be a non-pretty print.


        • AffineStructure 9 years ago + 0 comments

          When you did "lm(price ~ ., data=foo)", did you have the data saved in a data frame?
          I was having trouble doing it this way because price needs to be assigned to the last vector. Did you rename the last column of the data frame to price, or did you have all the vectors floating free in the global environment?
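One way this could work, sketched under the assumption that the data already sits in a data frame whose last column is the price: rename that column, and the "." in the formula picks up every remaining column. The data frame contents and the names new_rows and pred are invented for illustration and are not from the original posts.

```r
# Hypothetical training data; the last column is the price (here 1 + 2*V1 + 3*V2).
foo <- data.frame(V1 = c(0, 1, 0, 1, 2),
                  V2 = c(0, 0, 1, 1, 1),
                  V3 = c(1, 3, 4, 6, 8))

# Rename only the last column, however many feature columns there are.
names(foo)[ncol(foo)] <- "price"

# "." expands to all columns of foo other than the response.
fit <- lm(price ~ ., data = foo)

# Query rows must use the same column names as the training features.
new_rows <- data.frame(V1 = c(2, 3), V2 = c(2, 1))
pred <- predict(fit, newdata = new_rows)

# write() puts five numbers per line by default; ncolumns = 1 gives one per line.
write(round(pred, 2), stdout(), ncolumns = 1)
```

Nothing needs to float in the global environment this way, since lm looks the variables up in foo.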


        • alexey_filippov 9 years ago + 0 comments

          On strsplit, scan actually does the same out of the box:

          nums <- suppressWarnings(readLines(file("stdin")))
          
          fn <- scan(text=nums[1])
          f <- fn[1]
          n <- fn[2]
          

          On the coefficients, there's no need to extract anything: as soon as you've got the fitted model, you can use predict to apply it to a data frame of queries.

          On printing, StackOverflow users suggest something along the lines of,

          write.table(cat(format(answer, nsmall=1), sep="\n"), sep = "", append=T, row.names = F, col.names = F)
          

          Frankly, I don't really understand how cat and write.table interact here, but it seems to work just fine.
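A likely explanation, shown as a small sketch: cat prints its arguments as a side effect and returns NULL, so the write.table wrapper probably receives nothing to write and cat alone does the job. The values in answer are made up, and sprintf is swapped in for format here only because it fixes the number of decimals explicitly.

```r
# Hypothetical predictions to print, one per line.
answer <- c(3.14159, 2.5, 10)

# sprintf is vectorised and returns one formatted string per value.
formatted <- sprintf("%.2f", answer)

# cat writes the strings to stdout and returns NULL, with no [1] index in the output.
res <- cat(formatted, sep = "\n")

# Confirms that cat returned nothing that write.table could use.
is.null(res)
```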


        • andreymir 9 years ago + 1 comment

          Easy to read data with read.delim(file="stdin", header = FALSE, sep = " ")


          • prlpzb 9 years ago + 0 comments

            I agree that using read.delim or read.table is a lot easier than reading lines and then trying to build vectors and frames. The trick is to read the whole input as a data frame and then slice it to get the problem data.

            Example:

            tot=read.table(file="stdin",header=FALSE,fill=TRUE,sep=" ")
            f=tot$V1[1]
            n=tot$V2[1]
            origen=tot[2:(n+1),]
            t=tot$V1[n+2]
            desti=tot[(n+3):(n+2+t),]
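The slicing above could be carried through to a fitted model roughly as follows. This is a sketch, not the poster's full solution: the input is a made-up example passed through the text argument so it runs without stdin, and the names train, queries, and pred are introduced here for illustration. With fill=TRUE, read.table sizes the table from the first five lines, which is enough here because the second line is already a full training row.

```r
# Hypothetical input in the same layout: "f n", n training rows, t, then t query rows.
input <- c("2 5", "0 0 1", "1 0 3", "0 1 4", "1 1 6", "2 1 8", "2", "2 2", "3 1")

# Short lines (header, t, queries) are padded with NA on the right.
tot <- read.table(text = input, header = FALSE, fill = TRUE, sep = " ")

f <- tot$V1[1]
n <- tot$V2[1]
t <- tot$V1[n + 2]

# Training rows: f feature columns plus the price in column f + 1.
train <- tot[2:(n + 1), 1:(f + 1)]
names(train)[f + 1] <- "price"

# Query rows: keep only the f feature columns, dropping the NA padding.
queries <- tot[(n + 3):(n + 2 + t), 1:f, drop = FALSE]

fit <- lm(price ~ ., data = train)
pred <- predict(fit, newdata = queries)

# One prediction per line, two decimals.
cat(sprintf("%.2f", pred), sep = "\n")
```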


        • AlejandroBlanco 9 years ago + 0 comments

          49.13... What a pain!

          My results on the test case are:

          82.28545033210533
          159.9594001121738
          138.99344089799777
          117.35990799068198

          with a score of 0.965.

          I was wondering if anyone else is getting similar results. I used Java, by the way. Do you think rounding in the double type could be responsible? There could be something wrong in the calculation, but I think it is straightforward. If there were a bug, I would expect the code to crash rather than give an approximate result.


          • andreymir 9 years ago + 0 comments

            Trying to solve this in R and the answer I'm getting for the sample data is close:

            105.214558351069
            142.670951307299
            132.936054691247
            129.701754045025
            

            and the sample output is:

            105.22
            142.68
            132.94
            129.71
            

            But it is not precise, so I'm not getting the full score, although the other two tests are marked as passed when I submit this solution. It is scored at 30.38. I'm using the lm function to calculate the coefficients.


            • nicocai 9 years ago + 0 comments

              Test case #2 always times out, while the others are fine. Any tips?


              • geminigal 9 years ago + 0 comments

                The next problem starts when the contest ends. Can that be correct?
