Good day. I'm rather new here and I'm not really an expert at using PHP, but I was wondering if any of the gurus here would be kind enough to help me with something... or at least be willing to point me in the right direction.
See, I'm trying to make an online, community-based document archive for a university... and the real core of the site is being able to upload documents so that everyone else may read and bathe in their wisdom... but not copy the document and plagiarize your work (or anyone else's, for that matter). Of course, making something completely copy-proof is a bit difficult, but I would at least want to do something in order to make life a lot more difficult for those that love to copy other people's hard work.
Considering that, I know that .PDFs are just the thing that I need. PDFs can be encoded in such a way that they don't even allow people to select the text within... let alone copy the damn thing. So if someone were to plagiarize that kind of document, they would either need to decrypt the password protecting the ability to select and/or copy the text or painstakingly read the whole damn thing and copy it manually. Which, in that case... one might as well just study it and see what kind of new knowledge may be gained from such a document and work from that point forward.... which is basically the "gist" of the whole site.
Anyway, to keep things on target, what I'm asking is what do I need to do to make this a reality? What can I do, what codes, APIs, scripts or ANYTHING can I implement in the page in order for documents uploaded in a .doc or .docx (Damn Office2007 and their format change) format to be converted to .PDFs and copy-protected?
Can this be done in PHP? Is there some script someone's made? I'm REALLY not very good at using PHP, but I'm trying to restrict myself to using free-licensed software in this thing... PHP, MySQL, that sort of thing.
I would be LUDICROUSLY grateful to anyone that might know about this little problem I've got... Thank you all for reading this!
Is it possible? .Docs to .PDFs?
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
Re: Is it possible? .Docs to .PDFs?
You could exec() OpenOffice or Adobe's Acrobat from PHP. Both will do .doc to .pdf conversion.
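A minimal sketch of that exec() approach, assuming LibreOffice (the OpenOffice successor) is installed on the server with its soffice binary on the PATH, and that the web server user may run it. The function names buildConvertCommand and convertToPdf are made up for illustration, not part of any library:

```php
<?php
// Hypothetical helper: builds the headless conversion command line.
// Assumes a LibreOffice-style "soffice" binary that supports --convert-to.
function buildConvertCommand(string $inputPath, string $outputDir, string $binary = 'soffice'): string
{
    return sprintf(
        '%s --headless --convert-to pdf --outdir %s %s 2>&1',
        escapeshellcmd($binary),
        escapeshellarg($outputDir),   // directory the PDF is written to
        escapeshellarg($inputPath)    // uploaded .doc or .docx file
    );
}

// Hypothetical helper: runs the conversion and returns the path of the new PDF.
function convertToPdf(string $inputPath, string $outputDir): string
{
    exec(buildConvertCommand($inputPath, $outputDir), $output, $exitCode);

    // The converter keeps the base name and swaps the extension for .pdf.
    $pdfPath = rtrim($outputDir, '/') . '/' . pathinfo($inputPath, PATHINFO_FILENAME) . '.pdf';

    if ($exitCode !== 0 || !is_file($pdfPath)) {
        throw new RuntimeException("Conversion failed:\n" . implode("\n", $output));
    }
    return $pdfPath;
}
```

In an upload handler this would probably be called right after move_uploaded_file(), e.g. convertToPdf('/var/uploads/thesis.docx', '/var/archive') with whatever paths the site actually uses. Escaping every argument matters here because the file name comes from the user.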
(#10850)
Re: Is it possible? .Docs to .PDFs?
Well, that is true... I could execute the program on the server from the page...
But I'm not all that well versed in how to do the conversion "automatically" from the website.
What I mean is that I can exec() OpenOffice every time there's an upload... but once that happens, how can I make it go through the process of converting the file and at the same time modifying it so that it's copy protected...? o_o
I'm really at a loss here, you'll have to excuse my ignorance.
- Christopher
Re: Is it possible? .Docs to .PDFs?
So you want information on what settings to use for OpenOffice or Acrobat to create documents whose contents cannot be easily copied?
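On the settings question, one possible route is to leave OpenOffice and Acrobat out of the protection step and post-process the converted PDF with the qpdf command-line tool instead. The sketch below assumes qpdf is installed on the server; buildRestrictCommand and restrictPdf are made-up names. The restriction is only a permission flag that well-behaved viewers honor, so it slows copying down rather than preventing it:

```php
<?php
// Hypothetical helper: builds a qpdf command that writes an encrypted copy
// with text extraction and modification disallowed.
function buildRestrictCommand(string $inputPdf, string $outputPdf, string $ownerPassword): string
{
    return sprintf(
        'qpdf --encrypt %s %s 256 --extract=n --modify=none -- %s %s 2>&1',
        escapeshellarg(''),              // empty user password: anyone can open the file
        escapeshellarg($ownerPassword),  // owner password guards the permission flags
        escapeshellarg($inputPdf),
        escapeshellarg($outputPdf)
    );
}

// Hypothetical helper: runs qpdf and fails loudly if no output file appears.
function restrictPdf(string $inputPdf, string $outputPdf, string $ownerPassword): void
{
    exec(buildRestrictCommand($inputPdf, $outputPdf, $ownerPassword), $output, $exitCode);
    if ($exitCode !== 0 || !is_file($outputPdf)) {
        throw new RuntimeException("qpdf failed:\n" . implode("\n", $output));
    }
}
```

The owner password would be generated and stored server-side; readers never need it, since the user password is empty.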
Re: Is it possible? .Docs to .PDFs?
Etamnanki wrote: Well, that is true... I could execute the program from the page within the server...
But I'm not all too versed on how to do the conversion "automatically" from the website
What I mean is that I can do the exec() OpenOffice every time that I do an upload... but once that happens, how can I make it go through the process of converting it and at the same time modifying it so that it's copy protected...? o_o
I'm really at a loss here, you'll have to excuse my ignorance.
I'm afraid you may have misunderstood what arborint said. He was pointing out that .pdf documents are not at all difficult to convert and/or copy. Anyone who has MS Word or OpenOffice, visits http://zamzar.com, or uses any number of other tools can easily copy from a .pdf file.
The bottom line is that anything that is readable is potentially copiable. You might want to consider registering users so that at least you could track what your users may do. I would hold absolutely no hope that you can even make it difficult for people to copy anything that you want to post on a web site.
Re: Is it possible? .Docs to .PDFs?
I do appreciate your views... like I said, it's always going to be copyable and it might not be all that difficult to do. I'm not out to re-write the book on anti-plagiarism... I guess I haven't really explained how the site works.
It's actually part of my grad paper on how much more effective it would be for students and teachers alike in my university to freely share knowledge over a network than to simply archive all of it in some file cabinet somewhere and have to go through MOUNTAINS of procedure just to read someone else's paper in order to get some sort of reference on whatever subject someone might be investigating. My idea revolves around a community effort in my university to upload documents, evaluate them and rate them. First and foremost, for anyone to do anything on that page they must be registered. Doing so will give each user access based on what kind of account or "role" they play in the university. Students may upload and read documents at will, as well as rate and comment on specific documents that have previously been approved by professors as suitable for publication. Professors can upload, read and evaluate documents, which also grants them the ability to publish the documents once they have been revised and deemed useful for publication. They are also responsible for creating groups associated with documents in order to help maintain the authorship of any paper... since most of them will be co-authored by many people anyway.
The crux of my inquiry comes when people download a document to view it. See, the object of the site is to share knowledge... not to let Jimmy McCopypaste simply copy someone else's paper on Differential Calculus when he downloads it. Instead, what he will be able to do is read the thing thoroughly and see how it's done so that he might have a slightly increased chance of doing his paper legitimately. Of course, this tactic is not going to be fool-proof, like I said before. All that I want to do is give those that upload documents into the university's server a slightly increased chance of securing their authorship of whatever they uploaded in there.
I might be misunderstanding the message... and if I am, I really apologize to all of you and ask that you excuse my ignorance. I know that I can convert my own .doc files into PDF using OpenOffice or a myriad of other tools that are available online... I just need some pointers on how to implement them in a website using PHP or Javascript or whatever's available so that once a person uploads a .DOC (or .DOCX... friggin' MS) file into the database, it'll automatically go through the conversion process and end up being stored in there as a .PDF with at least some kind of copy-protection (I know it might be a moot point... but for the sake of presenting this thing to the less computer-literate denizens of my university, bear with me...). That way, whoever wants to read the document downloads the PDF and not the original .DOC file, which can be easily manipulated by practically anyone.
Sorry again if I'm asking impertinent questions or if my analogies aren't the best around... I'm just a bit stressed, but I shouldn't take it out on anyone. I appreciate you taking the time to even read this, I really do.
P.S: Sorry again if I'm being a little thick by asking this... but if I don't do such conversions, then what can I do at all to make it "slightly" more difficult to copy/modify? I'm open to any kind of suggestion...
Re: Is it possible? .Docs to .PDFs?
First point: yes, PHP, with appropriate libraries, CAN generate .pdf documents programmatically, but (1) it takes a LOT of programming skill and the right library resources on the server, and (2) when you start worrying about the new MS Word 2007 format, all bets are off, and (3) it still gets you about 0.5% of the way toward making it difficult for users to copy the material. That's why I suggested a focus on the human engineering side of it, rather than technology, which isn't going to help you very much in this effort.
You obviously understand all the pertinent issues. I remember a hundred years ago when I was working on a thesis, how frustrating it was to encounter seeming obstacles to my goal. You have my empathy. I will suggest, however, that you direct your focus on the human engineering, rather than "copy protection" aspects of the solution, since the latter can give you next to no protection at all.
Re: Is it possible? .Docs to .PDFs?
I think the conversion from word to pdf is a good idea, but the copy protection is, I fear, not going to help you out too much. There are just too many workarounds that require very little effort or thought to break.
I think that perhaps it might be less problematic to design a back end system to compare a document submitted by little Jimmy McCopypaste against the stored documents of the same category. Some heavy lifting involved for the server, but I'm sure with a bit of thought and a good scheduler, a report could be generated with a similarity ranking to a short list of other documents.
Just a thought.
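A rough sketch of how such a similarity ranking could work, using overlapping word triples ("shingles") and the Jaccard index. This is one common technique, not necessarily what the poster had in mind. It assumes the documents have already been reduced to plain text (extracting text from .doc or .pdf files is a separate step), and the function names are hypothetical:

```php
<?php
// Break a text into a set of overlapping word sequences ("shingles").
// Lowercasing is ASCII-only here, which is a simplification.
function shingles(string $text, int $size = 3): array
{
    $words = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $set = [];
    for ($i = 0; $i + $size <= count($words); $i++) {
        // Array keys act as a set, so repeated shingles collapse into one entry.
        $set[implode(' ', array_slice($words, $i, $size))] = true;
    }
    return $set;
}

// Jaccard index: shared shingles divided by all distinct shingles (0.0 to 1.0).
function jaccard(array $a, array $b): float
{
    if (!$a && !$b) {
        return 0.0;
    }
    $intersection = count(array_intersect_key($a, $b));
    $union = count($a) + count($b) - $intersection;
    return $intersection / $union;
}

// Score a submission against stored texts (id => text) and keep the top matches.
function rankSimilar(string $submitted, array $archive, int $limit = 5): array
{
    $target = shingles($submitted);
    $scores = [];
    foreach ($archive as $id => $text) {
        $scores[$id] = jaccard($target, shingles($text));
    }
    arsort($scores);                              // highest similarity first
    return array_slice($scores, 0, $limit, true); // preserve document ids
}
```

Comparing every submission against every stored paper gets expensive as the archive grows, which is where limiting the comparison to one category and running it from a scheduled job would come in.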