Wikipedia:Reference desk/Archives/Computing/2015 June 11

Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


June 11

The if statement in java

1.
class Something {
    public static void main(String[] as) {
        int a;
        if (true) { a = 88; }
        System.out.println(a);
    }
}

// The output for 1 is, as expected, 88.
2.
class Something {
    public static void main(String[] as) {
        int a;
        int x = 99;
        if (x == 99) { a = 88; }
        System.out.println(a);
    }
}

// For 2 I am getting the compiler error "variable a might not have been initialized". So, as I understand it, the if test fails and we are trying to print the value of a without initializing it, which is syntactically forbidden.

3.
class Something {
    public static void main(String[] as) {
        int a;
        int x = 99;
        if (x == 99) { a = 88; }
        else { a = 77; }
        System.out.println(a);
    }
}

// For 3 I am getting the value 88, which I can't understand, as here too the if test should fail and the else block should run, printing the value 77. Please point out the mistake in my logic here. The JDK version is 1.6 on Windows 7. 123.238.96.199 (talk) 05:28, 11 June 2015 (UTC)

It would make no sense for the third program to print 77. You set x to 99, then immediately test (x == 99), which is true, so a will be set to 88. The third program's behavior makes sense, and your theory about why the second program fails must be wrong.
The real reason the second program won't work is that Java requires local variables to be "definitely assigned" a value before they're used, and if (x == 99) { a = 88; } doesn't "definitely assign" a value to a by those rules. At run time, it always would assign the value, but you never got to that point because the program was (correctly) rejected at compile time. -- BenRG (talk) 06:24, 11 June 2015 (UTC)
This rule is overly strict, and the language specification even admits: "even though the value ...is known at compile time, and in principle it can be known at compile time that the assignment ... will always be executed... a Java compiler must operate according to the rules laid out in this section. The rules recognize only constant expressions."
Such strictness excludes certain obvious optimizations. However, it also makes the compiler behave more correctly when handling non-obvious code: for example, bear in mind that most implementations of the JVM permit threads to execute across multiple CPUs. Because Java specifies such explicit and strictly-enforced rules about compiler treatment of constant values, entire categories of cache coherency bugs are eliminated. So: you lose some performance in the trivial case - a handful of extra machine cycles - but you might get an order-of-magnitude performance boost in non-trivial application program code because the compiler can more easily discern provably correct optimizations that will work on any software or hardware realization of the JVM. The strictness of the language makes it easy for the compiler to offload work to other threads (on other hardware instances, perhaps with incoherent memory), and strict consistency can still be guaranteed.
Our article definite assignment analysis cites several sources in which researchers have studied this exact performance tradeoff across different types of optimizing compiler.
Nimur (talk) 12:38, 11 June 2015 (UTC)
The definite assignment rules don't preclude any optimizations. A compiler must reject invalid programs according to the rules in the spec, but it doesn't have to reuse the same analysis for optimization. It can use more sophisticated techniques, including constant propagation, when optimizing valid programs. The definite assignment rules are unrelated to thread safety. They only apply to initialization of local variables and final fields, neither of which can be shared between threads. -- BenRG (talk) 03:35, 12 June 2015 (UTC)
I think you misunderstand. These strict language requirements enable processor optimizations on platforms whose hardware implementation is more complicated than that provided as the JVM machine abstraction. For example, consider a standard multi-core computer. A single thread can hop to different CPUs, and this can happen in between evaluation/execution of individual Java bytecodes! When a thread hops CPUs, its context must also move to a different physical storage location - the new core's cache, for example. This means that a locally-scoped variable in one single-threaded program's stack may physically reside in several places - in DRAM, and in the caches of one or more CPUs that will each execute the single thread at different times. In order to guarantee correct execution of a Java program on such platforms (e.g. where the CPUs do not have hardware-enforced cache coherency), the language specification must enforce extra-strict requirements, or else the language must forbid certain types of thread scheduling (precluding the VM, the operating system, or the hypervisor from scheduling a thread on the most desirable core, or precluding the thread from ever moving around: i.e. mandating that each JVM thread must have infinite CPU affinity! That might be legal, but it is not efficient or flexible, and it's not how HotSpot or any other common JVM actually implements threading). This problem becomes even more fascinating when the JVM is not virtual, but is actually implemented as a hardware pipeline, as in the case of Jazelle on certain ARM CPUs. By enforcing specific language requirements, certain hardware optimizations - like CPU register renaming and out of order execution can be proven correct and therefore guaranteed safe... which means formal verification for the purposes of fabricating silicon is possible.
The converse of this: if you only and exclusively program on machines with automatic hardware cache coherency, highly-developed operating system scheduling, and a very clean and robust system process model, you'll never see these problems and so the whole discussion seems moot. The performance advantages at stake here have already been handled by some other programmer lower down in the stack, and you're probably happy with whatever choices they made (as long as the JVM never crashes). Chances are, if this applies to you, you're probably a Java programmer who is happy letting somebody else write your thread scheduling algorithms! Nimur (talk) 06:00, 13 June 2015 (UTC)
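The definite assignment rule discussed in this thread can be illustrated with a toy model. The Python sketch below is not how javac is implemented; it assumes a made-up program representation (tuples tagged "assign" and "if") and a hypothetical helper named definitely_assigned. It only mimics the part of the rule that matters for the three programs above: a condition counts as known only when it is a constant expression, so a non-constant test such as x == 99 requires both branches to assign the variable.

```python
def definitely_assigned(stmts, assigned=frozenset()):
    """Return the variables that are definitely assigned after stmts.

    Toy model (an assumption, not javac's real algorithm). A statement is
    ("assign", name) or ("if", cond, then_block, else_block_or_None), where
    cond is True/False for a constant expression and a string otherwise.
    """
    assigned = set(assigned)
    for stmt in stmts:
        if stmt[0] == "assign":
            assigned.add(stmt[1])
        elif stmt[0] == "if":
            _, cond, then_block, else_block = stmt
            after_then = definitely_assigned(then_block, assigned)
            after_else = definitely_assigned(else_block or [], assigned)
            if cond is True:       # constant true: only the then-branch can run
                assigned = after_then
            elif cond is False:    # constant false: only the else-branch can run
                assigned = after_else
            else:                  # not a constant: both branches must assign
                assigned = after_then & after_else
        else:
            raise ValueError("unknown statement kind: {!r}".format(stmt[0]))
    return assigned


# The three programs from the question, in the toy representation.
program1 = [("if", True, [("assign", "a")], None)]
program2 = [("assign", "x"), ("if", "x == 99", [("assign", "a")], None)]
program3 = [("assign", "x"),
            ("if", "x == 99", [("assign", "a")], [("assign", "a")])]

for label, program in [("1", program1), ("2", program2), ("3", program3)]:
    ok = "a" in definitely_assigned(program)
    print("program", label, "accepted" if ok else "rejected: a might not have been initialized")
```

Under this model only program 2 is rejected, which matches what the original poster observed.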

Permissions error?

I work in a networked office (all various models of Macs). Every time I try to open up certain Excel documents I get the message "NAME OF FILE is being modified by another user. Open as Read-Only, or choose Notify to be alerted when it is available". The document is not open anywhere else and is on my hard drive. I get around this by opening as read only, editing, then re-saving under a slightly modified name, then deleting the prior doc and retitling it back (but it's a pain in the ass). This only happens with Excel documents. Moreover, if I use such particular docs as a model to create another, the same issue crops up--but does not if I create a new Excel document, or use one as a model that does not have this error. That consistency leads me to believe that it is some setting that can be fixed. Any ideas of what I might try? (This is my second thread here about an Excel issue, what a finicky program it is.) Thanks in advance.--96.246.181.46 (talk) 14:03, 11 June 2015 (UTC)

There may be a zombie lock file hanging around. Excel lock files are apparently called ~$foo.xlsx (where the document is foo.xlsx). A lock file should be cleaned up when Excel stops editing a file, but such a zombie might get left behind if there is a crash, or if the network fails. Deleting the lock file should fix the problem. But apparently this file isn't normally visible in the MacOS Finder; I don't know how to make it show them - you could find and delete them with the terminal. -- Finlay McWalterTalk 14:12, 11 June 2015 (UTC)
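As a rough sketch of the search step, the following Python snippet only lists candidate lock files and leaves any deletion to the user. It assumes the ~$ naming pattern mentioned above is accurate; find_lock_files is a hypothetical helper name, not part of Excel or macOS.

```python
from pathlib import Path


def find_lock_files(folder):
    """Return files under folder whose names start with '~$'.

    The '~$' prefix is the lock-file pattern assumed in the post above.
    Nothing is deleted; the caller decides what to do with the results.
    """
    return sorted(p for p in Path(folder).rglob("~$*") if p.is_file())


# Example use (the folder is a placeholder for wherever the workbooks live):
#     for path in find_lock_files("/path/to/workbooks"):
#         print(path)
```

Listing first and deleting by hand is the cautious choice here, since a lock file that belongs to a workbook that really is open should be left alone.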
Share your pain, OP. I get very similar Excel issues with our (Windows) office network, typically when I save or close an Excel document I've been working on for some time. It usually tells me that the other person already using the file is – me!
My solution is similar to yours – 'save as' to the desktop, close everything and reboot my PC (else the problem persists), go to the file on the drive on the server and delete the existing non-updated version, substitute the 'saved as' version and rename it to the original name (necessary as several other people also have desktop shortcuts to the files concerned on the server). Sometimes, however, Excel as a whole locks up and I can't even 'save as'.
I'm not the only one in the office who's suffered this, but it seems to happen mostly in the evening, and I work late more often than my co-workers.
This either first arose or got markedly worse since we upgraded to Office 365. Our IT Department has looked at this over a period of 9 months and couldn't come up with a sure explanation or cure. On their advice I now open all Excel documents in 'Safe Mode', but it still happens occasionally. It's been suggested that it's when bandwidth is being reduced by other (housekeeping?) traffic (we haven't been able to get fibre installed due to a contractual dispute between BT and the Landlord, despite it being promised when we moved in 18 months ago) and some commands get their order scrambled, but that may be poppycock for all I know – I'm not an IT expert, that's what we pay the IT staff for! {The poster formerly known as 87.81.230.195} 212.95.237.92 (talk) 13:34, 12 June 2015 (UTC)

How to clear a windows python console without using os.system("CLS")?

Hello, I would like to know how to clear a Windows-based Python console without using os.system("CLS"). I tried to do this with sys.stdout but failed. Does anyone have any ideas on how to do this? Thanks in advance. -- SGA314 (talk) 15:39, 11 June 2015 (UTC)

Print the "Erase Display" ANSI escape code -- Finlay McWalterTalk 16:21, 11 June 2015 (UTC)
How do I write the "Erase Display" escape code? I tried ESC[CSI 2 J] but it didn't work. What did I do wrong? -- SGA314 (talk) 16:35, 11 June 2015 (UTC)
So how would I print an ASCII escape character in Python? -- SGA314 (talk) 17:15, 11 June 2015 (UTC)
The Windows console doesn't support ANSI escape codes. -- BenRG (talk) 00:09, 12 June 2015 (UTC)
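For terminals that do interpret ANSI sequences (which, per the reply above, the Windows console in question does not), the escape character can be written in a Python string as "\x1b". A minimal sketch follows; ansi_clear is a hypothetical helper name, and the cursor-home sequence is an addition that the thread did not ask for.

```python
import sys

ESC = "\x1b"                 # ASCII escape character (decimal 27)
ERASE_DISPLAY = ESC + "[2J"  # CSI 2 J: erase the whole display
CURSOR_HOME = ESC + "[H"     # CSI H: move the cursor to the top-left corner


def ansi_clear(stream=None):
    """Write the erase-display and cursor-home sequences to stream."""
    if stream is None:
        stream = sys.stdout
    stream.write(ERASE_DISPLAY + CURSOR_HOME)
    stream.flush()
```

Calling ansi_clear() with no arguments writes to standard output. On a console that ignores ANSI sequences the bytes show up as stray characters instead of clearing anything.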
Python also supports ncurses by way of the curses module! Official documentation for this feature is written by none other than Eric S. Raymond.
With curses, you can "clear the terminal" (blanking previous terminal content) using newwin. The original terminal content can later be restored. This behavior is distinct from the "cls" command.
Nimur (talk) 18:20, 11 June 2015 (UTC)
Already tried that. I can't import the curses module because there is no _curses module present. -- SGA314 (talk) 19:01, 11 June 2015 (UTC)
This will do it (better error checking left as an exercise):
from ctypes import byref, c_uint16, c_uint32, windll, Structure
from ctypes.wintypes import _COORD, SMALL_RECT

class CONSOLE_SCREEN_BUFFER_INFO(Structure):
    _fields_ = (("dwSize", _COORD),
                ("dwCursorPosition", _COORD),
                ("wAttributes", c_uint16),
                ("srWindow", SMALL_RECT),
                ("dwMaximumWindowSize", _COORD))

STD_OUTPUT_HANDLE = -11

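def clear_screen(attr=0x0F):  # 0x0F is white foreground, black background
    # Handle to the console (only valid if standard output isn't redirected).
    hConsoleOutput = windll.kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    csbi = CONSOLE_SCREEN_BUFFER_INFO()
    if windll.kernel32.GetConsoleScreenBufferInfo(hConsoleOutput, byref(csbi)):
        # Total number of character cells in the screen buffer.
        num_cells = csbi.dwSize.X * csbi.dwSize.Y
        # Write that many spaces, starting at the upper left of the buffer.
        windll.kernel32.FillConsoleOutputCharacterA(hConsoleOutput, ord(' '), num_cells, _COORD(0, 0), byref(c_uint32()))
        # Set the color attributes of the same cells.
        windll.kernel32.FillConsoleOutputAttribute(hConsoleOutput, attr, num_cells, _COORD(0, 0), byref(c_uint32()))
        # Move the cursor to the upper left of the buffer.
        windll.kernel32.SetConsoleCursorPosition(hConsoleOutput, _COORD(0, 0))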
-- BenRG (talk) 00:09, 12 June 2015 (UTC)
That is absolutely genius, BenRG! Can you explain how it works? Thank you so much! -- SGA314 (talk) 14:09, 12 June 2015 (UTC)
Huh. I didn't know that Windows doesn't support ANSI escape codes. Good to know. -- SGA314 (talk) 14:20, 12 June 2015 (UTC)
Don't praise it; it's horrible code. I made it more readable (though longer). GetStdHandle gets a handle to the console (only if the standard output isn't redirected!), GetConsoleScreenBufferInfo gets the width and height of the console (and other information we don't care about), num_cells is the total number of character cells in the console, FillConsoleOutputCharacterA writes that many spaces starting at the upper left of the buffer (0,0), FillConsoleOutputAttribute does the same with the character attributes (foreground and background color), and SetConsoleCursorPosition moves the cursor to the upper left of the buffer. -- BenRG (talk) 05:39, 13 June 2015 (UTC)
It looks good to me. Thanks for explaining it to me. -- SGA314 I am not available on weekends (talk) 15:19, 15 June 2015 (UTC)

Which tools automatically suggest the creation of new aliases or shell functions by analyzing a user's command history?

Thanks. Apokrif (talk) 16:02, 11 June 2015 (UTC)

Number 1: what are you using for your command session? Windows, Mac OS X, Linux, or something else? And number 2: could you explain the question in more detail? -- SGA314 (talk) 16:10, 11 June 2015 (UTC)
I'm looking for such tools working with any shell on any platform (strangely enough, a quick Google search returned nothing).
For instance, if a command like cp ~/foobar/*.txt~ /baz pops up 10 times a day in the command history, the tool will automatically create, or suggest the creation of, an alias with a short name that does the same thing (if the tool is smart enough, it will even suggest names like ToBaz or TxtBak). Apokrif (talk) 16:21, 11 June 2015 (UTC)
Well, I don't know about cross-platform, but I do have an idea how to do this with a Python console. -- SGA314 (talk) 16:23, 11 June 2015 (UTC)
Cross-platform? Perhaps my question was misphrased: I am looking for all such tools, no matter with which shell or on which platform they work, not tools which can run with all shells on all platforms. Apokrif (talk) 18:21, 11 June 2015 (UTC)
If you accept Linux as an operating system (though it appears you are using Windows, or you wouldn't ask this question, and most people think that mentioning Linux is taboo, but you have stated that you want a cross-platform answer)... it is not "normal" to have a tool that does something that a few common tools do. For example, if I want to see the 10 most common things I type in bash, I run: grep -v "^#" ~/.bash_history | sort | uniq -c | sort -n | tail -n 10. Because I actually use the command line, I don't look at that and start crying like a baby. I look at it and I see that I get everything that is not a comment, I sort it, I count the unique lines, I sort it numerically, and I get the top 10 lines. I could write an alias that does all of that if I wanted. I wouldn't write a program to do it because it already exists and it is sitting right there waiting for me to use. So, that meets the "show me the most common commands I use so I can make an alias for them" part of your question. If you want suggestions for names for the alias, you can pipe the output of the existing command into sed to use a regular expression to remove spaces, slashes, etc... and perhaps remove duplicated letters. So, www would become w. 209.149.113.240 (talk) 16:29, 11 June 2015 (UTC)
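For a shell-independent version of the same counting step, here is a minimal Python sketch. It is an illustration, not an existing tool: most_common_commands is a hypothetical name, and unlike the pipeline above it also skips blank lines and lists the most frequent command first.

```python
from collections import Counter


def most_common_commands(history_lines, n=10):
    """Count identical command lines, skipping blanks and '#' comment lines."""
    counts = Counter(
        line.strip()
        for line in history_lines
        if line.strip() and not line.startswith("#")
    )
    # List of (command, count) pairs, most frequent first.
    return counts.most_common(n)


# Made-up history lines for demonstration only.
sample = ["ls", "#1434000000", "cp a b", "ls", "cp a b", "ls"]
for command, count in most_common_commands(sample, n=2):
    print(count, command)
```

To run it on a real history, pass an open file object for the history file (for bash, ~/.bash_history) instead of the sample list.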
I don't get your point(s). Are you trying to prove that it is not normal to use computers to perform boring and repetitive tasks? Apokrif (talk) 18:21, 11 June 2015 (UTC)
I think ...240 is trying to say that you shouldn't necessarily expect such a specific program to exist, but if you use the example grep command, that basically functions as a "program" that will suggest what your most common commands are (or the most common commands that use an option, etc), and you can use that output to come up with candidates for good time-saving aliases. In other words ...240 is saying the "program" for any platform that has grep and a command history is just the appropriate grep command. Furthermore, it shouldn't be too hard to wrap that in a little Bash script, use cron to run it once a week, and pipe to sed for the alias naming. So a skilled Unix user (not me) should be able to come up with a script that meets your needs in relatively little time and code. People often share lists of their favorite aliases (e.g. [1]), but I agree with ...240's suggestion that there may not be such a program for Unix-like systems, because it is relatively easy to just make aliases manually, browse others' ideas, or write a simple script if you want programmatic suggestions. SemanticMantis (talk) 18:38, 11 June 2015 (UTC)
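The naming step described in this thread (strip spaces, slashes and similar characters, then collapse duplicated letters) can be sketched in Python as well. The function names and the 8-character cut-off are assumptions made for illustration; the result is only a suggestion, since two different commands can collapse to the same name.

```python
import re
import shlex


def suggest_alias_name(command, max_len=8):
    """Derive a short name: keep letters and digits, collapse repeated characters."""
    letters = re.sub(r"[^A-Za-z0-9]", "", command)
    collapsed = re.sub(r"(.)\1+", r"\1", letters)
    return collapsed[:max_len]


def alias_line(command):
    """Format an alias definition, quoting the command for a POSIX shell."""
    return "alias {}={}".format(suggest_alias_name(command), shlex.quote(command))


print(alias_line("cp ~/foobar/*.txt~ /baz"))
```

A real tool would still need to check the suggested name against existing commands and aliases before proposing it.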
My main point was that the mindset of a command-line user (especially in Linux) is to use the tools provided to do complicated things. It is not common for a command-line user to ask for a specific program to perform every little task. For example, I don't ask for a program to find the biggest file in a folder. I use du and sort. Therefore, I wouldn't write a program to do what is commonly called a "one-liner." I wouldn't expect anyone else to write such a program either. How you get your snarky "it is not normal to use computers to perform boring and repetitive tasks" is difficult to understand. I specifically stated that I have my computer do repetitive tasks without bothering me. 209.149.113.240 (talk) 19:36, 11 June 2015 (UTC)
There is a long and deeply-philosophical argument about whether a command-line (or any other user interface) should do what it believes you meant, or do only and exactly what you say, and not do anything more or less. This argument is decades old and has pervaded system and application design since the earliest days of computer software.
A command shell that intelligently guesses your intent is potentially very useful. There do exist such command-shells. The first that comes to mind is the command-not-found handler that Canonical ships with bash on Ubuntu. If you type a command that can't be found, the shell will recommend packages to install that might fulfill your needs. This is not the only such "helpful" command line interface; but it is probably the most commonly-used version today.
This advanced behavior is very neat and helps many users. Other users - including myself - find that this kind of "user-friendly" behavior deeply violates the principle of least astonishment. It is self-evident to programmers like us that we could make a command-line tool to do that, but if we needed it, we would make it ourselves, and if somebody else makes it for us, their heuristic choices will conflict with our own expectations.
I wouldn't read this attitude as "snarky" or "curmudgeonly." It's simply a matter of a different use-case. Other users wish to have intelligent agent user-interfaces that automatically change their behaviors to make mundane tasks "easier." I find such automatic behavior to be unpredictable; and in my evaluation, I value predictability at a much higher premium than "ease-of-use." Nimur (talk) 22:07, 11 June 2015 (UTC)
I only stated that it was "snarky" because the response was nearly the exact opposite of my statement. 209.149.113.240 (talk) 12:56, 12 June 2015 (UTC)

For Emacs, Keyfreq could help. Apokrif (talk) 15:52, 25 September 2018 (UTC)

Dealing with a big file > 100 MB in Excel

How can I deal with a rather big Excel file with thousands of rows? Couldn't Excel just pick a row, process it, go to the next, pick another row, process it, and so on? That would make the file size or number of rows irrelevant, although you would still have to stick to a certain maximum number of columns. At any given point in time Excel would just need to store one row and keep track of variables. --Llaanngg (talk) 18:36, 11 June 2015 (UTC)

Are you asking how Excel handles such things internally? Or maybe you are asking what you can do if Excel won't open a given file? Or something else? It's unclear to me, but Excel 2013 should be able to handle 100 MB files, unless you are in a very memory-limited environment. It is hard-limited to 1,048,576 rows by 16,384 columns though, as described here [2]. You might be interested in this thread [3], which contains some discussion and links on how to deal with a 2.7 GB file, which is much closer to a modern notion of "big". SemanticMantis (talk) 18:49, 11 June 2015 (UTC)
I was asking how to deal with it. The files are above 100 MB, and not just a little above: they can easily be 250 MB. The processing starts to get noticeably slow at 100 MB, and that includes opening, changing, searching and closing, although the Excel files are well below the hard limit of 1,048,576 rows. It is Excel 2010.
Researching this shows me that there are many potential causes of slow processing of an Excel file, which would perhaps have been best tackled upfront, when deciding how the worksheet should be laid out. I imagine no one will be able to provide a simple solution that could speed up any worksheet. So, maybe someone could explain how Excel handles the file internally. Is it doing it in a way that's cumbersome? I don't have any problem opening PDF files bigger than 100 MB, or opening more than one of them. --Llaanngg (talk) 19:40, 11 June 2015 (UTC)
I also deal with large Excel spreadsheets with up to 4000-odd rows and up to 70 columns. They're usually no problem, but on occasion a copy sent to me to work on is sluggish, and it seems to be that it's looking at the entire maximum possible spreadsheet size, not just the fraction actually used, so that CTRL+End will go not to, say, Row 2263 Col AY, but to Row bazillion Col ZZZ... (can't find an actual example just now).
One suggestion is that somehow a cell entry has appeared/been made way out in the boondock cells, and to try deleting everything outside the actual working territory, but this can be awfully tedious to do. {The poster formerly known as 87.83.230.195} 212.95.237.92 (talk) 13:47, 12 June 2015 (UTC)
Some thoughts:
1) Reading data from files is much slower than accessing it from memory (RAM).
2) Therefore, most programs will attempt to load all the data into RAM first, then work with that.
3) This can make the initial loading slow. Some programs will attempt to only read enough to get started first, then read the rest while you are using the first part. That might not work very well for a spreadsheet, though, if it's trying to perform operations that require all the data, like finding totals.
4) If the program runs out of RAM, then it will use paging space, which is extremely slow. You might want to monitor your memory usage and see if the program gets slow when it approaches 100%. If so, then more RAM might be needed.
5) The other option would be to break up the file into smaller files. For example, if it's a customer database, maybe you could have an A-M file and an N-Z file. This would apply if you always look up customers by name. If you always looked them up by customer number, then you might break it up that way. If you look them up in multiple ways, then splitting the file up will be more problematic. If you can describe the data and how you look up rows, perhaps we can suggest a good way to break it up.
6) If Excel is written poorly, it might allocate space for empty columns and rows, too. If so, then deleting those might help.
7) You might also have an inefficient use of cells, for example if a timestamp or number is stored in a character field. Fixing that might decrease the file size significantly.
StuRat (talk) 14:27, 12 June 2015 (UTC)
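The split suggested in point 5 can be sketched outside Excel once the data is available as CSV. This is only an illustration with invented column names and sample data; bucket_for and split_rows are hypothetical helpers, and a real split would write each bucket to its own file instead of keeping the rows in lists.

```python
import csv
import io


def bucket_for(name):
    """Return 'A-M' or 'N-Z' from the first letter of name (anything else goes to 'N-Z')."""
    first = name.strip()[:1].upper()
    return "A-M" if "A" <= first <= "M" else "N-Z"


def split_rows(reader, name_column):
    """Group rows from a csv.DictReader into the two buckets."""
    buckets = {"A-M": [], "N-Z": []}
    for row in reader:
        buckets[bucket_for(row[name_column])].append(row)
    return buckets


# Made-up customer data for demonstration only.
data = io.StringIO("name,balance\nAdams,10\nZimmer,20\nMiller,5\n")
buckets = split_rows(csv.DictReader(data), "name")
print({key: [row["name"] for row in rows] for key, rows in buckets.items()})
```

As the post notes, this only helps if lookups are always by name; a split on customer number would need a different bucket_for.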
@Llaanngg: You might be able to increase performance by trying some of these suggestions to reduce file size [4]. I would additionally try writing out as .csv to strip all the formatting nonsense you probably don't need, then re-importing it to Excel. SemanticMantis (talk) 17:58, 12 June 2015 (UTC)
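Once the data is in .csv form, the row-at-a-time processing described in the original question is straightforward outside Excel. The sketch below uses Python's standard csv module; column_total and the column names are made up for illustration, and it assumes every value in the chosen column is numeric.

```python
import csv
import io


def column_total(csv_file, column):
    """Sum one numeric column, holding only the current row in memory."""
    total = 0.0
    for row in csv.DictReader(csv_file):  # yields one row at a time
        total += float(row[column])
    return total


# Made-up data for demonstration only.
sample = io.StringIO("item,amount\nwidget,2.5\ngadget,4\n")
print(column_total(sample, "amount"))
```

For a real export, pass a file opened with open(path, newline="") instead of the StringIO sample. Memory use then stays roughly constant regardless of the number of rows.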
How many rows/columns are you dealing with? I routinely work with spreadsheets with between 100,000 and 400,000 rows, and while it isn't blazing fast, most operations that affect the whole spreadsheet (e.g. sorting a column, changing formats) take 5-10 seconds. I'll grant that I only have a dozen or so columns, but you might want to examine how Windows is allocating memory. I'll also second the suggestion to make sure Excel doesn't think there is data in (say) column ZZ, or row 1,048,576; this WILL slow things down. If this is the case, then it's best to copy the spreadsheet to a new tab, since Excel "remembers" that a cell used to have data in it, and never seems to consistently update its end-of-record markers, even when the data is no longer there. OldTimeNESter (talk) 19:24, 12 June 2015 (UTC)