Hacker News | new | past | comments | ask | show | jobs | submit | hassiktir's comments

I don't understand how this isn't bigger news.

Local emergency services were basically nonfunctional for the better part of the day, coinciding with the heat wave and various events. It seems likely that a number of deaths (locally at least, specific to what I know for my mid-sized US city) will be indirectly attributable to this.


It's entirely possible (likely, even) that someone died from this, but it's hard to know with critically ill patients whether they would have survived without the added delays.


On aggregate it is. How many deaths over the average for these conditions did we see?


We are in the process of calculating this, but we need this 24-hour period to roll over so we can benchmark the numbers against a similar 24-hour period. It's hard to tell whether the numbers we get back will even be reliable, given that a lot of today's statistics, from what I can tell, have come in via email or similar.


So what?


Give it like, a week before bothering to ask such questions...


I don't really understand this logic. If you are trying to determine which company (all else being equal) is more likely to be successful, would you not pick the company that already has validation from the population most likely to use whatever it is you are pitching? Essentially you are saying that getting funded by a big name is more indicative of success than validation from the consumers who will actually use the service/product/etc.


Is there stuff like this for more 'modern' events? These suggestions are great and I like stuff like HH, but I was wondering if there is anything that covers, say, 1920s events or the Cold War, in that same period (up until maybe the '80s or '90s)?


A constitutional monarchy is almost a direct parallel to many organized crime syndicates. Is that not legitimate governance?


The king's bloodline rules because god says so... You're not a free man, you're a subject.

Do you see that as "legitimate"?


I'm not entirely sure what you mean. As in, feed the predictions back in for some length of time to see what the dialogue becomes?


Don't worry, it was just an amusing and honestly really impractical idea :)

The idea would be to make an educated guess at where each word occurs in the video - going off the time and subtitle data from pysrt - and build a dict linking words to when they occur in the video. You could then use MoviePy and stitch together a video version of the generated dialogue, by looking up the appropriate clip for each word.


Ahh, that does make sense now, and I think it could be very feasible with a much more complex, sort of blended NN, since the .SRT files do have the time for each subtitle phrase (i.e.

  7
  00:00:23,060 --> 00:00:24,619
  give a turnaround version
) but I am not sure of the best way to go about doing something like this.
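A rough sketch of the lookup-table idea using only the stdlib (pysrt would do this parsing more robustly, and MoviePy would handle the actual clip extraction; the regex and sample contents here are just illustrative):

```python
import re

def parse_srt(srt_text):
    """Build a word -> first-occurrence-time (in seconds) dict from SRT text."""
    word_times = {}
    # Each block: index line, "HH:MM:SS,mmm --> HH:MM:SS,mmm", then text lines.
    pattern = re.compile(
        r"(\d+)\s*\n(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> [\d:,]+\s*\n(.+?)(?:\n\n|\Z)",
        re.S)
    for m in pattern.finditer(srt_text):
        h, mnt, s, ms = (int(m.group(i)) for i in range(2, 6))
        start = h * 3600 + mnt * 60 + s + ms / 1000.0
        for word in re.findall(r"[a-z']+", m.group(6).lower()):
            word_times.setdefault(word, start)  # keep the first occurrence only
    return word_times

srt = """7
00:00:23,060 --> 00:00:24,619
give a turnaround version
"""
times = parse_srt(srt)
print(times["give"])  # 23.06
```

Each word then maps to the start time of the first subtitle block containing it; something like MoviePy's subclip could cut a short clip around each timestamp and concatenate them in dialogue order.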


It's all just a ruse to create excitement for YC, and no one was in fact invited to interview.


Python pandas in an IPython notebook?


Yep. I don't get why people act like being an "employee-friendly place to work" is a perk. I've never seen a job posting, or had a recruiter contact me, saying "well, we aren't a very good place to work, but we pay 2x above market to compensate."

If anything, I think I've realized from my limited experience that companies that push how great their culture is are usually the ones with really high turnover, where HR is being pressured into projecting an image (on job sites and in screening) of an environment new devs would want to work at.


Usually the writeup is just supplemental to the postings on the forum (there is usually a "post your solution" thread where people walk through their process and sometimes post code), but I'm guessing this is just one of the competitions where it will take a month or so to gather all the interviews from the actual winners as well as BAYZ. The writeup probably won't add much, since the forum posts already go quite in depth about the process, and the linked blog post shows an almost literal step-by-step of how to "game" Kaggle leaderboards, which is a huge aspect of competing (and why there is a private vs. public dataset, why there are "must enter by" deadlines, why people post benchmark code that will beat 25% of the current leaderboard before the competition is over, and various other reasons).

Also, looking at my submissions and notes from the competition (though I stopped after the first week or so), I even noted which "groups" were most likely on the public leaderboard, since it becomes easy to tell by comparing your personal metric score against how that score lands on the leaderboard, given that you know the evaluation metric for each competition.


This is really cool (and unfortunately I know nothing about PowerShell; I'm young and dumb), but I'm pretty sure this is replicable line for line with bash (and curl, maybe awk too, idk). Am I wrong?


The main difference between bash/GNU coreutils and PowerShell is that every command returns an object (or an enumeration of objects), and commands can take an object (or an enumeration of objects) as input.

This lets everything implicitly understand how to access named properties without everyone having to do string parsing.

Sure, you can solve the same problems with bash/grep/awk/sed/etc - but sometimes it's a bunch simpler to solve in Powershell.


Don't use awk for parsing XML/JSON, there are excellent tools like html-xml-utils or jq that already do that well:

    $ sudo apt-get install html-xml-utils jq

    $ curl 'https://duckduckgo.com/html/?q=cake' | hxclean \
       | hxselect  .web-result:first-child .snippet

    $ curl apy.projectjj.com/listPairs | jq .responseData[0]

Though I'd typically use Python's BeautifulSoup for anything major with html, there's just too much bad html out there ;)
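(For a sense of what that looks like without BeautifulSoup: a minimal sketch using only the stdlib's html.parser, targeting the .snippet class from the hxselect example above. BeautifulSoup is far more forgiving with the malformed html you actually find in the wild.)

```python
from html.parser import HTMLParser

class SnippetExtractor(HTMLParser):
    """Collect the text inside elements whose class list includes 'snippet'."""
    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside a snippet element
        self.snippets = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or "snippet" in classes:
            self.depth += 1
            if self.depth == 1:
                self.snippets.append("")  # start a new snippet

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.snippets[-1] += data

p = SnippetExtractor()
p.feed('<div class="web-result"><span class="snippet">A cake is a dessert.</span></div>')
print(p.snippets)  # ['A cake is a dessert.']
```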


Wow, I didn't even know of jq, but it looks super cool. I was trying to avoid packages you would have to install, since it seemed all the functionality in the post I responded to used built-in PowerShell commands.


Probably not. In my experience it's not so much that things are impossible elsewhere, just that PowerShell is often better for fiddling in a REPL until you get the results you want. The point where you need (or want) to upgrade to a more powerful language is IMHO reached earlier in bash than in PowerShell².

PowerShell handles objects like Unix utilities handle text. You get a lot more orthogonality in commands, e.g. there's just a few commands dealing with JSON, CSV, XML, etc.¹ – they mostly do just the conversion of a specific text-based format to an object list or tree representation and back. Everything else after that can be done with just the same set of core commands which deal with objects. Filtering, projecting, and iterating over object sequences is probably the most common part of PowerShell scripts and one-liners and it's a part that's useful everywhere you need the language. Note that we have a bit of that in the samples here, too. ConvertTo-JSON is used to convert a hashmap to a JSON string to use in a request, and Invoke-WebRequest already handles parsing HTML for us so the following line can just filter elements by certain attributes.

Where the ideal format for the core Unix tools you use most frequently is free-form text, you often have special commands that work on other formats by replicating a few core tools' features on that format, e.g. a grep for JSON, a grep for XML, etc. There are others, of course, that do the conversion in a way that's friendly to text-based tools, but the representation still lacks fidelity. Finding XML elements with certain attributes quickly becomes an exercise in writing robust regexes. Cutting up CSV by column numbers is a frequent occurrence, and since the tools do not understand the format, it makes for not a pretty read in the script's code. Personal opinion here, based on lots of PowerShell written, some shell scripts written, and lots of horrible things read in either (sure, awful PowerShell scripts do exist, but I'd argue that discovering a nice way is much easier when you can cut most of the ad-hoc parsers written with regexes or string manipulation out of pipelines).

(One last point about orthogonality of commands: ls has a few options on how to sort or output the results, for example. Sorting a list of things? That's Sort-Object's domain. Formatting a list of things? Format-Table, Format-Wide, Format-List. That's quite a bit less each individual command has to do, and it's all just a nicety for the user. For working with the output programmatically you don't need them at all (well, and can't use them).)

I have a few posts on SO where I tried to steer people into actually learning how the language works and that you should use the pipeline as much as possible, e.g. http://stackoverflow.com/a/7394766/73070 or http://stackoverflow.com/a/3104721/73070. That's not to say you can't write un-understandable things: http://stackoverflow.com/q/1018873/73070.

And finally, let's not forget that PowerShell exists on Windows where text-based tools are mostly useless. Want to query the event log? The registry? WMI? Good luck. Windows has a long history of having non-text formats everywhere in the system and for administration it's a bit hard to pretend they don't exist. Jeffrey Snover elaborates a bit on that here: http://stackoverflow.com/a/573861/73070.

There may be a lot of developers getting by with Unix tools on Windows, but they're not the target audience for PowerShell³. And the previous approach to scripting things on Windows servers and domains was either batch files or VBScript/JScript.

This ... uhm, got longer than anticipated and probably a lot less coherent than planned. Apologies for anything that makes no sense. I didn't have caffeine yet.

______

¹ ConvertFrom-JSON and ConvertTo-JSON for JSON for example. For CSV there are also convenience commands that directly work on files as well, so there's four of them. XML processing is built-in since .NET can do that easily already.

² I probably don't get to live the rest of the day now, I guess.

³ I am a developer, though, who uses PowerShell daily for scripting, as a shell, and a .NET playground. The PowerShell team at MS was a bit surprised once when I told them that my background was not server administration (the Scripting Games were quite focused on that part and often involved doing things with WMI or AD – stuff I rarely, if ever, do.)


Your comment was useful, so not totally in vain :-)

It prompted me to have a little look at PowerShell (read through [0], a useful intro). It looks nice; I can definitely see the utility in having a simple object model for transferring information between processes. In nix land you get pretty good at extracting data from simple text forms, though sometimes it's harder than it should be.

One thing that jumped out at me there is the overhead of the commands.

    430ms: ls | where {$_.Name -like "*.exe"}
    140ms: ls *.exe
     27ms: ls -Filter "*.exe"
Not so much the absolute numbers, but the fact that there are three different ways of doing it, and the most flexible choice is over an order of magnitude slower.

What happens when you add another command to the pipeline? Do they buffer the streams like in linux?

I guess the situation will improve over time, but how complete is the ecosystem at the moment? One area where nix will always shine is total ubiquity. Everything can be done over commands, and everything works with text.

[0] https://developer.rackspace.com/blog/powershell-101-from-a-l...


You found three ways of doing things that each do filtering at a different level. The -Filter parameter employs filtering on the provider side¹ of PowerShell, i.e. in the part that queries the file system directly. Essentially your filter is probably passed directly to FindFirstFile/FindNextFile, which means that fewer results have to travel through fewer layers. The -Path parameter (implied in ls *.exe, as it's the first parameter) works within the cmdlet itself, which is also evident in that it supports a slightly different wildcard syntax (you can use character classes here, but not in -Filter) because it is already over in PowerShell land. The slowest option pushes filtering yet another layer up, by using the pipeline, so you get two cmdlets passing values to each other, and that's some overhead as well, of course. Note that the most flexible option here combines Get-ChildItem with Where-Object, whereas the direct equivalent in Unix would probably be find, which replicates some of ls' functionality² to do what it does, placing it in a very similar spot to ls, performance-wise.

It's not uncommon for the most flexible option to be the slowest, though. In my own tests my results were 18 ms, 115 ms and 140 ms for doing those commands in $Env:Windir\system32, so the difference wasn't as big as in your case. For a quick command on the command line I feel performance is adequate in either case, unless you're doing things with very large directories. If you handle a large volume of data, regardless of whether it's files, lines, or other objects, you probably want to filter as much as you can as close to the source as you can – generally speaking.

As for buffering ... I'm not aware of any, unless the cmdlet needs the complete output of the previous one to do its work. Every result from a pipeline is passed individually from one cmdlet to the next by default. Some cmdlets *do* buffer, though, e.g. Get-Content has a -ReadCount parameter that controls buffering in the cmdlet (man gc -param readcount). Sort-Object and Group-Object are the most common (for me at least) that always need the complete output of the previous stage before returning anything, for quite obvious reasons.

However, even though I did some work on Pash, the open-source reimplementation of PowerShell, I'm not terribly well-versed in its internal workings, so take the buffering part with a grain of salt.

As for completeness, well, the Unix ecosystem has an enormous edge here, simply by having been there for decades and amassing tools and utilities. Since PowerShell was intended for system administrators you can expect nearly everything needed there to have PowerShell-native support. This includes files, processes, services, event logs, active directory, and various other things I know little to nothing about. Get-Command -Verb Get gives you a list of things that are supported directly that way. It seems like even configuration things like network, disks and other such things are supported by now. At Microsoft there's a rule, I think, that every new configuration GUI in Windows Server has to be built on PowerShell. Which means, everything you can do in the GUI, you can do in PowerShell, and I think you can in some cases even access the script to do the changes you just made in the GUI – e.g. for doing the same change on a few hundred machines at once, or whatever.

Of course, you can just work with any collection of .NET objects by virtue of the common cmdlets working with objects (gcm -noun object). For me, whenever there is no native support, .NET is often a good escape hatch, that in many cases isn't terribly inconvenient to use. You also have more control over what exactly happens at that level, because you're one abstraction level lower. As a last resort, it's still a shell. It can run any program and get its output. Output from native programs is returned as string[], line by line, and in many cases that's not worse than with cmd or any Unix shell.

_____

¹ Keep in mind, the file system is just one provider and there are others, e.g. registry, cert store, functions, variables, aliases, environment variables that work with exactly the same commands. That's why ls is an alias for Get-ChildItem and there is no Get-File, because those commands are agnostic of the underlying provider.

² So much for do one thing – but understandable, because ls' output is not rich enough to filter for certain things further down in the pipeline.


Oh yeah, I can see how the different filtering spots would make a difference.

I was just a bit surprised at the overhead of adding a command to the pipeline. A similar setup on Linux would be something like the following, I guess (on a folder with 4600 files, 170 matching).

    20ms time ls *.pdf
    35ms time find -maxdepth 1 -iname '*.pdf'
    60ms time ls | egrep '.pdf$'
I was more wondering whether each additional command adds that much overhead. Your numbers look much more reasonable; maybe the article I read had a big jump due to filesystem caching or something.

